5.1.6

Compugen Ltd.



OM nucleic - nucleic search, using sw model

Run on:         January 14, 2004, 07:40:11 ; Search time 2570.42 Seconds
                                             (without alignments)
                                             9398.725 Million cell updates/sec

Title:          US-09-864-675-1
Perfect score:  994
Sequence:       1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       22781392 seqs, 12152238056 residues

Total number of hits satisfying chosen parameters:      45562784

Minimum DB seq length: 0
Maximum DB seq length: 2000000000



Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       EST:*
                 1:  em_estba:*
                 2:  em_esthum:*
                 3:  em_estin:*
                 4:  em_estmu:*
                 5:  em_estov:*
                 6:  em_estpl:*
                 7:  em_estro:*
                 8:  em_htc:*
                 9:  gb_est1:*
                 10:  gb_est2:*
                 11:  gb_htc:*
                 12:  gb_est3:*
                 13:  gb_est4:*
                 14:  gb_est5:*
                 15:  em_estfun:*
                 16:  em_estom:*
                 17:  em_gss_hum:*
                 18:  em_gss_inv:*
                 19:  em_gss_pln:*
                 20:  em_gss_vrt:*
                 21:  em_gss_fun:*
                 22:  em_gss_mam:*
                 23:  em_gss_mus:*
                 24:  em_gss_pro:*
                 25:  em_gss_rod:*
                 26:  em_gss_phg:*
                 27:  em_gss_vrl:*
                 28:  gb_gss1:*
                 29:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1
BI918620
LOCUS       BI918620                 805 bp    mRNA    linear   EST 16-OCT-2001
DEFINITION  603176570F1 NIH_MGC_121 Homo sapiens cDNA clone IMAGE:5240969 5',
            mRNA sequence.
ACCESSION   BI918620
VERSION     BI918620.1  GI:16182295
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 805)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11607 row: k column: 18
            High quality sequence start: 2
            High quality sequence stop: 778.
FEATURES             Location/Qualifiers
     source          1..805
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5240969"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
BASE COUNT      169 a    243 c    263 g    130 t
ORIGIN

  Query Match             67.8%;  Score 674;  DB 12;  Length 805;
  Best Local Similarity   98.7%;  Pred. No. 3.6e-156;
  Matches  732;  Conservative    0;  Mismatches    5;  Indels    5;  Gaps    5;



Qy      1 ATGAGGCGCGACCCGGCCCCCGGC-TTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTG 59
          |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Db     64 ATGAGGCGCGACCCGGCCCCCGGCGTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTG 123

Qy     60 CTACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGA 119
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    124 CTACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGA 183

Qy    120 GGGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCC 179
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    184 GGGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCC 243

Qy    180 GCCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGG 239
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    244 GCCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGG 303

Qy    240 GGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCA 299
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    304 GGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCA 363

Qy    300 GCGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTG- 358
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    364 GCGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGT 423

Qy    359 CCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTG 418
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    424 CCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTG 483

Qy    419 ACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGA 478
          ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||
Db    484 ACTGCGCCACCCGGCCCAAGTTGAAGAAGATGACGAGCCAGACGGGACAGGTGGGTGAGA 543

Qy    479 AGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCA 538
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    544 AGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCA 603

Qy    539 AGGATGGCAAGGAGCTCAACCG-CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGA 597
          |||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
Db    604 AGGATGGCAAGGAGCTCAACCGTCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGA 663

Qy    598 AAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGA-GTATGTCTG 656
          |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db    664 AAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGGTATGTCTG 723

Qy    657 CGAGG-CCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCG 715
          ||||| |||||||||||||||||||||||||||||||||||||| |||||||||||||
Db    724 CGAGGCCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGTTTTACGTCAACAGGT 783

Qy    716 TGAGCACCACCCTGTCATCCTG 737
          ||||||||| ||||||||||||
Db    784 TGAGCACCAACCTGTCATCCTG 805
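The percentages in each result summary appear to follow directly from the other reported figures. The sketch below is a cross-check only, not the search program's implementation. It assumes that Query Match is the score divided by the perfect score, and that Best Local Similarity is the match count divided by the number of aligned columns (matches + mismatches + indels). The function names are hypothetical; the inputs are the figures reported for RESULT 1 and the perfect score from the run header.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the query's perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all aligned columns, gap columns included."""
    aligned_columns = matches + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


# Figures reported for RESULT 1; perfect score 994 comes from the run header.
print(query_match_pct(674, 994))             # 67.8
print(best_local_similarity_pct(732, 5, 5))  # 98.7
```

Under the same assumptions, the figures for the other results in this listing reproduce their reported percentages as well (for example 467.2 / 994 gives 47.0%).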



RESULT 2
BM914622
LOCUS       BM914622                1047 bp    mRNA    linear   EST 12-MAR-2002
DEFINITION  AGENCOURT_6615334 NIH_MGC_113 Homo sapiens cDNA clone IMAGE:5480308
            5', mRNA sequence.
ACCESSION   BM914622
VERSION     BM914622.1  GI:19365001
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1047)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Mark Watson
            cDNA Library Preparation: Rubin Laboratory
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLCM2002 row: p column: 05
            High quality sequence stop: 541.
FEATURES             Location/Qualifiers
     source          1..1047
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5480308"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_113"
                     /note="Organ: spleen; Vector: pOTB7; Site_1: XhoI; Site_2:
                     EcoRI; cDNA made by oligo-dT priming. Directionally cloned
                     into EcoRI/XhoI sites using the following 5' adaptor:
                     GGCACGAG(G). Library constructed by Ling Hong in the
                     laboratory of Gerald M. Rubin (University of California,
                     Berkeley) using ZAP-cDNA synthesis kit (Stratagene) and
                     Superscript II RT (Life Technologies). Note: this is a
                     NIH_MGC Library."
BASE COUNT      263 a    347 c    254 g    183 t
ORIGIN

  Query Match             59.0%;  Score 586;  DB 12;  Length 1047;
  Best Local Similarity   98.2%;  Pred. No. 2.4e-134;
  Matches  603;  Conservative    0;  Mismatches   10;  Indels    1;  Gaps    1;



Qy    272 GCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTTTCCTGGAGCCCACGGAAC 331
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 GCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTTTCCTGGAGCCCACGGAAC 60

Qy    332 AGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGA 391
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 AGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGA 120

Qy    392 AAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGA 451
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 AAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGA 180

Qy    452 AGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTA 511
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 AGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTA 240

Qy    512 ATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACA 571
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 ATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACA 300

Qy    572 TTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGG 631
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGG 360

Qy    632 TGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCC 691
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 TGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCC 420

Qy    692 GGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCC 751
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCC 480

Qy    752 GGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCG 811
          ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
Db    481 GGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGCCTGCTACTACATCG 540

Qy    812 AGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTT-G 870
          ||| |||||| ||||| |||||||||||||| |||||||||||| |||  | |||||| |
Db    541 AGGCCATCAATCAGCTTTCCTGCAAATGTCCCAATGGATTCTTCCGACCAACATGTTTGG 600

Qy    871 GAGAAACTGCCTTT 884
          ||||||||||| ||
Db    601 GAGAAACTGCCCTT 614



RESULT 3
BI412864/C
LOCUS       BI412864                1041 bp    mRNA    linear   EST 14-AUG-2001
DEFINITION  602988202F1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:5144016 5',
            mRNA sequence.
ACCESSION   BI412864
VERSION     BI412864.1  GI:15173787
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1041)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11355 row: d column: 01
            High quality sequence start: 11
            High quality sequence stop: 645.
FEATURES             Location/Qualifiers
     source          1..1041
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="CZECH II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5144016"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
BASE COUNT      247 a    306 c    295 g    193 t
ORIGIN

  Query Match             47.0%;  Score 467.2;  DB 12;  Length 1041;
  Best Local Similarity   86.6%;  Pred. No. 6.4e-105;
  Matches  563;  Conservative    0;  Mismatches   78;  Indels    9;  Gaps    4;



Qy 


165 


Db 


656 


Qy 


221 


Db 


596 


Qy 


279 


Db 


536 


Qy 


338 


Db 


476 


Qy 


398 


Db 


416 


Qy 


458 


Db 


356 



CAGCACCCGAGAGCCGCCCGCCTCGGGTCGGGT GGCGTTGGTAAAGGTGCTGGACA 22 0 

II | | II I I I I I I I I I I I I I I I Ml I I I I I I I I I 1 I I I I I I I I 

CACCTCGAGATGCGCGCCCGCCTCGGGTTCGGTTGGCGTCTTGGTGAAAGGTGCTGGACA 597 

AGTGGCCG — CTCCGGAGCGGGGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTG 27 8 

Ml | | | I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AGTTGCCGGCTCCCGGATCGGGGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTG 537 

TGTGCCGCTCGAAAGGAACCAGCGCTACATCTT-TTTCCTGGAGCCCACGGAACAGCCCT 337 

M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I 

TGCGCCGCTCGAAAGGAACCAGCGCTACATCTTGTTTCCTGGAGCCCACCGAGCAGCCCT 477 

T AGTCT T T AAGAC GGC CT T TGC C C C CCT CGAT AC CAAC GGCAAAAAT CT CAAGAAAGAGG 397 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

T AGTT TT T AAGAC AGC CT T TTGCCCC GGTCGAC C CT AC GGCAAAT ACAT CAAGAAAGAG G 417 

T G GGCAAGAT C CT GT GC ACT GACT GCGC CACCCGGCCCAAGT T GAAGAAGAT GAAGAGC C 457 

I I I I I I I I I I I I I I II I I I I I I I I II I II I I II I I I I II II I I I I I I I I I M I I I I I I I 

T GGGCAAGAT C CT GT GCACT GACT GC GCC AC C C GGC CCAAGCT GAAGAAGAT GAAGAGC C 357 
AGACGGGAC AGGT GGGT GAGAAGCAAT C GCT GAAGT GT GAGGC AG C AGC C GGTAAT C C C C 517 

I I I I III II I I I I I I I I I M1M II II I I I I II I I II I I II 

AGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCAAGTGTGAGGCAGCGGCGGGAAACCCCC 297 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



518 AG C CT T C CT AC CGTT GGT T CAAGGAT GGCAAGGAGCT CAAC CGCAGC C GAGACAT T C GC A 577 

I I I I I I I I I II I I Ml I I I I I I I I I I I I I I I I I I I I I I I II II II I I I I I I I 

296 AGCCCTCCTATCGCTGGTT CAAGGAT GGCAAGGAACTCAACCGGAGTCGT GAT ATT CGCA 237 

578 T CAAATAT GGCAACGGCAGAAAGAACTCACGACTACAGTT CAACAAGGTGAAGGTGGAGG 637 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I 

236 T CAAGT AT GGCAAT G GC AGAAAGAACT C AC GGCT ACAGT T CAACAAAGT GAGGGT GGAGG 177 

638 ACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCC 697 

I II MINIM I II I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I Mllll 

176 AT GC C GGGGAGT ACGT CT GT GAGGC C GAGAAC AT C CT T GGGAAGGACACC GT GAGGGG CC 117 

698 GGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGT 757 

I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I 

116 GACT C CAT GT CAACAGC GT GAGC AC CACTCT GT CAT C CT GGT C GGGAC AT GC C C GGAAGT 57 

758 GCAACGAGACAGCCAAGT CCTA — TTGCGTCAATGGAGGCGTCTGCTACT 8 05 
I I I I I I II I III I I I II I I I II II I I I I I I I I I I I I I I I I I I 
56 GCAAT GAGAC C GC CAAGT C CT ACC AT GT GT GAAT GGAGGCGT GT GCT ACT 7 
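The reported scores can be decomposed from the match, mismatch, and gap counts. The sketch below is a consistency check under stated assumptions, not the search program's scoring code. The run header gives Gapop 10.0 and Gapext 1.0; the sketch assumes each gap costs Gapop once plus Gapext per gapped position. The mismatch penalty of 0.6 and the zero cost for an ambiguous base (N) are not stated anywhere in this report; they are inferred because they reproduce the scores listed here. The function name and parameters are hypothetical.

```python
def expected_score(matches, mismatches, gaps, indels, ambiguous=0,
                   mismatch_penalty=0.6, gap_open=10.0, gap_ext=1.0):
    """Rebuild an alignment score from its summary counts.

    mismatch_penalty and the zero cost of ambiguous (N) columns are
    assumptions inferred from the reported scores, not documented values.
    """
    penalized_mismatches = mismatches - ambiguous
    score = (matches
             - mismatch_penalty * penalized_mismatches
             - gap_open * gaps      # one opening charge per gap
             - gap_ext * indels)    # one extension charge per gapped position
    return round(score, 1)


# RESULT 3: 563 matches, 78 mismatches, 4 gaps spanning 9 positions.
print(expected_score(563, 78, 4, 9))  # 467.2
# RESULT 1: 732 matches, 5 mismatches, 5 single-position gaps.
print(expected_score(732, 5, 5, 5))   # 674.0
```

With one ambiguous column counted among the mismatches, the same function gives 395.6 for RESULT 6 (401 matches, 10 mismatches, no gaps), which matches its reported score.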



RESULT 4
BX281777
LOCUS       BX281777                 524 bp    mRNA    linear   EST 04-MAR-2003
DEFINITION  BX281777 NIH_MGC_121 Homo sapiens cDNA clone IMAGp998K1811607 ;
            IMAGE:5240969, mRNA sequence.
ACCESSION   BX281777
VERSION     BX281777.1  GI:28612804
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 524)
  AUTHORS   Ebert,L., Heil,O., Hennig,S., Neubert,P., Partsch,E., Peters,M.,
            Radelof,U., Schneider,D. and Korn,B.
  TITLE     Human UnigeneSet - RZPD3
  JOURNAL   Unpublished
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998K1811607.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Human UnigeneSet - RZPD3 (RZPDLIB No. 972)
            http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=972
            Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            M13u, Primer sequence: CGTTGTAAAACGACGGCCAGT.
FEATURES             Location/Qualifiers
     source          1..524
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGp998K1811607 ; IMAGE:5240969"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
BASE COUNT       99 a    174 c    172 g     79 t
ORIGIN

  Query Match             47.0%;  Score 467;  DB 13;  Length 524;
  Best Local Similarity   100.0%;  Pred. No. 5.5e-105;
  Matches  467;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     58 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 117

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    118 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 177

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    178 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 237

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    238 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 297

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    298 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 357

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    358 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 417

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    418 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 477

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACA 467
          |||||||||||||||||||||||||||||||||||||||||||||||
Db    478 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACA 524



RESULT 5
AA706226
LOCUS       AA706226                 549 bp    mRNA    linear   EST 12-JAN-1999
DEFINITION  ah28a07.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            1240116 3' similar to TR:P43328 P43328 NEU DIFFERENTIATION FACTOR
            NDF04 ;, mRNA sequence.
ACCESSION   AA706226
VERSION     AA706226.1  GI:2716144
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 549)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima Bonaldo,
            Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Possible reversed clone: similarity on wrong strand
            Possible reversed clone: polyT not found
            Insert Length: 689 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 451.
FEATURES             Location/Qualifiers
     source          1..549
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="1240116"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia)
                     with a modified polylinker; Site_1: Not I; Site_2: Eco
                     RI; 1st strand cDNA was primed with a Not I - oligo(dT)
                     primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
BASE COUNT      137 a    163 c    156 g     92 t      1 others
ORIGIN

  Query Match             40.6%;  Score 404;  DB 9;  Length 549;
  Best Local Similarity   98.5%;  Pred. No. 2.2e-89;
  Matches  407;  Conservative    0;  Mismatches    6;  Indels    0;  Gaps    0;

Qy    424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     15 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 74

Qy    484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
          ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db     75 TCGCTGAAGTGTGAGGCAGCAGCGGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 134

Qy    544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAAC 603
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    135 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAAC 194

Qy    604 TCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCC 663
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    195 TCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCC 254

Qy    664 GAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACC 723
          ||||||||||||||||||||||||||| | |||||||||||||||||||||||||  |||
Db    255 GAGAACATCCTGGGGAAGGACACCGTCGGAGGCCGGCTTTACGTCAACAGCGTGACGACC 314

Qy    724 ACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGC 783
          |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
Db    315 ACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGNGACAGCCAAGTCCTATTGC 374

Qy    784 GTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 836
          |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    375 GTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 427



RESULT 6
AI041451
LOCUS       AI041451                 412 bp    mRNA    linear   EST 28-AUG-1998
DEFINITION  ow36c02.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            IMAGE:1648898 3' similar to TR:O14511 O14511 NTAK. ;, mRNA
            sequence.
ACCESSION   AI041451
VERSION     AI041451.1  GI:3280645
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 412)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima Bonaldo,
            Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Trace considered overall poor quality
            Insert Length: 671 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 1.
FEATURES             Location/Qualifiers
     source          1..412
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:1648898"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia)
                     with a modified polylinker; Site_1: Not I; Site_2: Eco
                     RI; 1st strand cDNA was primed with a Not I - oligo(dT)
                     primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
BASE COUNT      112 a    108 c    126 g     65 t      1 others
ORIGIN

  Query Match             39.8%;  Score 395.6;  DB 9;  Length 412;
  Best Local Similarity   97.6%;  Pred. No. 2.4e-87;
  Matches  401;  Conservative    0;  Mismatches   10;  Indels    0;  Gaps    0;

Qy    426 CACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATC 485
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 CACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATC 60

Qy    486 GCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGG 545
          |||||||||||||||||||||    |||||||||||||||||||||||||||||||||||
Db     61 GCTGAAGTGTGAGGCAGCAGCGATAAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGG 120

Qy    546 CAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTC 605
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 CAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTC 180

Qy    606 ACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGA 665
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGA 240

Qy    666 GAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCAC 725
          |||||||||||||||||||||||| || |||||||||||||||||||||||||  |||||
Db    241 GAACATCCTGGGGAAGGACACCGTACGAGGCCGGCTTTACGTCAACAGCGTGACGACCAC 300

Qy    726 CCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGT 785
          ||||||||||||||||||||||||| |||||||||||| |||||||||||||||||||||
Db    301 CCTGTCATCCTGGTCGGGGCACGCCGGGAAGTGCAACGNGACAGCCAAGTCCTATTGCGT 360

Qy    786 CAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 836
          |||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 411



RESULT 7 
BX529505 

ID BX529505 standard; RNA; EST; 488 BP. 
XX 

AC BX529505; 
XX 

SV BX529505.1 
XX 

DT 27-MAY-2003 (Rel. 75, Created) 

DT 27-MAY-2003 (Rel. 75, Last updated, Version 1) 
XX 

DE RZPD Mus musculus cDNA clone IMAGp998N017639 = IMAGE:3153984 5' EST.
XX 

KW EST; expressed sequence tag. 
XX 

OS Mus musculus (house mouse) 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; Mammalia; 

OC Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

XX 

RN [1] 

RP 1-488 

RA Heil O., Ebert L., Neubert P., Peters M. , Radelof U., Schneider D., 

RA Korn B. ; 

RT 

RL Submitted (28-MAY-2003 ) to the EMBL/ GenBank/DDBJ databases. 

RL RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH Im Neuenheimer

RL Feld 580, D-69120 Heidelberg, Germany 

XX 

CC RZPD; IMAGp998N017639. 

CC RZPDLIB; I.M.A.G.E. cDNA Clone Collection; 

CC Mouse UnigeneSet - RZPD2 (RZPDLIB No. 981) 

CC http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=981

CC Contact: Ina Rolfs 

CC RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH

CC Heubnerweg 6, D-14059 Berlin, Germany 

CC Tel: +49 30 32639 101 

CC Fax: +49 30 32639 111 

CC www.rzpd.de 

CC This clone is available royalty-free from RZPD; 

CC contact RZPD (clone@rzpd.de) for further information. 

CC Seq primer: SP6, Primer sequence: ATTTAGGTGACACTATAG 

XX 

FH Key Location/Qualifiers 
FH 



FT source 1. .488 

FT /db_xref="taxon: 10090" 

FT /note="Cloned unidirectionally . Primer: Oligo dT. Average 

FT insert 2 kb. Library constructed by Life Technologies, 

FT catalog #12017-018. Investigators providing samples:, Lothar 

FT Hennighausen/Chu-Xia Deng, NIH Reference for transgenic 

FT model: Xu et al., Nature Genetics 22, 37-43 (1999). Note: 

FT this is a NCI_CGAP Library 

FT <http : //www. ncbi . nlm. nih . gov/ ncicgap/> . " 

FT /organism="Mus musculus" 

FT /clone="IMAGp998N017639" 

FT /clone_lib="NCI_CGAP_Mam3 mammary tumor" 

FT /lab_host="DH10B" 

XX 

SQ Sequence 488 BP; 129 A; 116 C; 149 G; 94 T; 0 other; 

  Query Match             38.3%;  Score 381;  DB 4;  Length 488;
  Best Local Similarity   91.0%;  Pred. No. 1e-83;
  Matches  405;  Conservative    0;  Mismatches   40;  Indels    0;  Gaps    0;
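
The summary figures printed for each hit can be cross-checked from the raw counts. The sketch below is not part of the GenCore output. It assumes that Best Local Similarity is matches divided by aligned columns and that Query Match is the score divided by the perfect score (994 in this run). The gap penalties are the Gapop 10.0 and Gapext 1.0 values from the report header. The mismatch penalty of 0.6 is an assumption inferred from the printed scores and is not stated anywhere in the report; the function name `summary_stats` is hypothetical.

```python
def summary_stats(matches, mismatches, indels, gaps, perfect_score,
                  mismatch_penalty=0.6, gap_open=10.0, gap_ext=1.0):
    """Recompute the per-hit summary figures from raw alignment counts.

    mismatch_penalty is an inferred value (assumption), not documented
    in the report; gap_open and gap_ext come from the report header.
    """
    aligned_columns = matches + mismatches + indels
    # Each match scores 1; each gap pays the opening penalty once and
    # the extension penalty once per inserted or deleted position.
    score = (matches
             - mismatch_penalty * mismatches
             - gap_open * gaps
             - gap_ext * indels)
    return {
        "score": round(score, 1),
        "best_local_similarity_pct": round(100.0 * matches / aligned_columns, 1),
        "query_match_pct": round(100.0 * score / perfect_score, 1),
    }


# Counts taken from the RESULT 7 summary lines.
print(summary_stats(matches=405, mismatches=40, indels=0, gaps=0,
                    perfect_score=994))
```

Under these assumptions the RESULT 7 counts give a score of 381, a best local similarity of 91.0% and a query match of 38.3%, which agrees with the printed summary. A database base reported as N appears to be scored differently, so hits containing ambiguous bases may not reproduce exactly.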

Qy    469 GTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTAC 528
          |||||||||||||| ||||| |||||||||||||| || || || |||||||| ||||| 
Db      1 GTGGGTGAGAAGCAGTCGCTCAAGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTAT 60

Qy    529 CGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGC 588
          || |||||||||||||||||||| |||||||| || || || ||||||||||| ||||||
Db     61 CGCTGGTTCAAGGATGGCAAGGAACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGC 120

Qy    589 AACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAG 648
          || ||||||||||||||||| |||||||||||||| |||| ||||||||| || ||||||
Db    121 AATGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAG 180

Qy    649 TATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTC 708
          || ||||| ||||||||||||||||| ||||||||||||||  ||||||| ||  | |||
Db    181 TACGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTC 240

Qy    709 AACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACA 768
          ||||||||||||||||| ||||||||||||||||| || |||||||||||||| ||||| 
Db    241 AACAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACC 300

Qy    769 GCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTC 828
          ||||||||||| || || ||||||||||| ||||||||||||||||||||||||||||||
Db    301 GCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTC 360

Qy    829 TCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGA 888
          ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db    361 TCCTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGA 420

Qy    889 TTGTACATGCCAGATCCTAAGCAAA 913
          |||||||||||||||||||||||||
Db    421 TTGTACATGCCAGATCCTAAGCAAA 445



RESULT 8 
BF108794 

LOCUS       BF108794     427 bp    mRNA    linear   EST 20-OCT-2000
DEFINITION  7152g03.x1 Soares_NSF_F8_9W_OT_PA_P_S1 Homo sapiens cDNA clone
            IMAGE:3525292 3' similar to SW:NTAK_HUMAN O14511 NTAK PROTEIN
            ; contains MSR1.t1 MSR1 repetitive element ;, mRNA sequence.
ACCESSION   BF108794
VERSION     BF108794.1  GI:10938484
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 427)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            This clone is available royalty-free through LLNL ; contact the
            IMAGE Consortium (info@image.llnl.gov) for further information.
            Seq primer: -40UP from Gibco
            High quality sequence stop: 396.
FEATURES             Location/Qualifiers
     source          1..427
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:3525292"
                     /lab_host="DH10B"
                     /clone_lib="Soares_NSF_F8_9W_OT_PA_P_S1"
                     /note="Organ: pooled; Vector: pT7T3D-Pac (Pharmacia) with
                     a modified polylinker; Site_1: Not I; Site_2: Eco RI;
                     Equal amounts of plasmid DNA from five normalized
                     libraries were mixed, and ss circles were made in vitro.
                     Following HAP purification, this DNA was used as tracer in
                     a subtractive hybridization reaction. The driver was
                     PCR-amplified cDNAs from pools of 5,000 clones made from
                     the same 5 libraries. The pools consisted of the following
                     libraries and cloneIDs: Soares NbHSF pool 1:
                     309384-310919, 323208-325895 Soares Nb2HP pool 1:
                     145032-147335, 147720-148103, 148872-149255, 15002 -
                     150407, 151176-152327 Soares Nb2HF8-9W pool 1:
                     758280-760583, 772104-774407 Soares NbHPA pool 1:
                     304776-306311, 320136-322823, 326280-326663 Soares NbHOT
                     pool 1: 723720-726407, 739080-740999 Subtraction by Bento
                     Soares and M. Fatima Bonaldo."
BASE COUNT      114 a    112 c    145 g     56 t
ORIGIN



  Query Match             36.6%;  Score 363.8;  DB 10;  Length 427;
  Best Local Similarity   91.3%;  Pred. No. 1.8e-79;
  Matches  386;  Conservative    0;  Mismatches   37;  Indels    0;  Gaps    0;



Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||   
Db      5 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 64

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     65 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 124

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    125 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 184

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    185 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 244

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    245 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 304

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    305 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 364

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    365 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 424

Qy    783 CGT 785
          |||
Db    425 CGT 427



RESULT 9 

BI410828/c 

LOCUS       BI410828     949 bp    mRNA    linear   EST 14-AUG-2001
DEFINITION  602963734F1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:5119065 5',
            mRNA sequence.
ACCESSION   BI410828
VERSION     BI410828.1  GI:15171751
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 949)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11290 row: d column: 10
            High quality sequence start: 28
            High quality sequence stop: 919.
FEATURES             Location/Qualifiers
     source          1..949
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="CZECH II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5119065"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
BASE COUNT      171 a    261 c    269 g    248 t
ORIGIN



  Query Match             31.9%;  Score 317.4;  DB 12;  Length 949;
  Best Local Similarity   84.2%;  Pred. No. 7.6e-68;
  Matches  442;  Conservative    0;  Mismatches   71;  Indels   12;  Gaps    7;



Qy    397 GTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGC 456
          |||||| |||||||| |||||| |||||||||||| |||||  ||||||||||||||| |
Db    947 GTGGGCCAGATCCTGGGCACTG-CTGCGCCACCCGCCCCAA-CTGAAGAAGATGAAGA-C 891



Qy 

Db 

Qy 

Db 

Qy 

Db 



457 



890 



517 



830 



C AGACGGGAC AGGT G GGT GAGAAGCAAT C GCT GAAGT GT GAGGCAGCAGC C GGT AAT C C C 
I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CAAACCAGAAGAGTCGGT GAGAACAGTTCGCT CAAGTGT GAGGCACGGCCGGGGAAACCC 



516 



831 



570 



CAGCCTTCCTACC GT T GGTT CAAGGAT GGC - AAGGAGCT CAAC C GCAGC C GAGAC 

I | Mill I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

C CCCAC C C CT CCCT AT CGCTGGTT T CAAG GAT GGCAAAGGAACTCAAC C GGAGT C GT GAT 771 



571 



ATTCGCATCAAATAT GGCAACGGCAGAAAGAACT CACGACTACAGTT CAACAAGGT GAAG 630 
I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I II I I I I I I I I I I I II 
77 0 AT T CGC AT CAAGT AT GCCAAT GGC AGAAAGAACT C ACGGCT ACAGT T CAACAAAAGT GAG 711 



Qy    631 GT--GGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCG 688
          ||  |  |   || |||||||| ||||| ||||||||||||||||| |||||||||||||
Db    710 GTTGGAGGATTGCCGGGGAGTACGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCG 651

Qy    689 TCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACG 748
          |   |||||| ||  | |||||||||||||||||||| ||||||||||||||||| || |
Db    650 TGA-GGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATG 592

Qy    749 CCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACA 808
          ||||||||||||| ||||| ||||||||||| || || ||||||||||| ||||||||||
Db    591 CCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACA 532



Qy 

Db 



809 
531 



868 
472 



Qy    869 TGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAA 913
          |||||||||||||||||||||||||||||||||||||||||||||
Db    471 TGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAA 427



RESULT 10 

BI651936 

LOCUS       BI651936     795 bp    mRNA    linear   EST 12-SEP-2001
DEFINITION  603298677F1 NCI_CGAP_Mam3 Mus musculus cDNA clone IMAGE:5339251 5',
            mRNA sequence.
ACCESSION   BI651936
VERSION     BI651936.1  GI:15566172
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 795)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Lothar Hennighausen Ph.D., Chu-Xia Deng Ph.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11861 row: j column: 20
            High quality sequence stop: 795.
FEATURES             Location/Qualifiers
     source          1..795
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="129,C57BL/6J, FVB/N"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5339251"
                     /tissue_type="tumor, gross tissue"
                     /dev_stage="10 months"
                     /lab_host="DH10B"
                     /clone_lib="NCI_CGAP_Mam3"
                     /note="Organ: mammary; Vector: pCMV-SPORT6; Site_1: SalI;
                     Site_2: NotI; Cloned unidirectionally. Primer: Oligo dT.
                     Library constructed by Life Technologies. Investigators
                     providing samples: Lothar Hennighausen/Chu-Xia Deng, NIH
                     Reference for transgenic model: Xu et al., Nature Genetics
                     22, 37-43 (1999)."
BASE COUNT      204 a    226 c    219 g    146 t
ORIGIN



  Query Match             27.4%;  Score 272.6;  DB 12;  Length 795;
  Best Local Similarity   92.3%;  Pred. No. 8.8e-57;
  Matches  298;  Conservative    0;  Mismatches   24;  Indels    1;  Gaps    1;



Qy    592 GGCAGAAAGAACTCACGAC-TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTA 650
          ||||||||||||||||| | ||||||||||||| |||| ||||||||| || ||||||||
Db      1 GGCAGAAAGAACTCACGGCTTACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTA 60

Qy    651 TGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAA 710
           ||||| ||||||||||||||||| ||||||||||||||  ||||||| ||  | |||||
Db     61 CGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAA 120

Qy    711 CAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGC 770
          ||||||||||||||| ||||||||||||||||| || |||||||||||||| ||||| ||
Db    121 CAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGC 180

Qy    771 CAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTC 830
          ||||||||| || || ||||||||||| ||||||||||||||||||||||||||||||||
Db    181 CAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTC 240

Qy    831 CTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATT 890
          ||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||
Db    241 CTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATT 300

Qy    891 GTACATGCCAGATCCTAAGCAAA 913
          |||||||||||||||||||||||
Db    301 GTACATGCCAGATCCTAAGCAAA 323



RESULT 11 

BE983573 

LOCUS       BE983573     333 bp    mRNA    linear   EST 29-APR-2002
DEFINITION  UI-M-CG0p-bgi-c-07-0-UI.s1 NIH_BMAP_Ret4_S2 Mus musculus cDNA clone
            UI-M-CG0p-bgi-c-07-0-UI 3', mRNA sequence.
ACCESSION   BE983573
VERSION     BE983573.1  GI:10654893
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 333)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: Chin, H
            National Institute of Mental Health
            6001 Executive Blvd. Room 7N-7190, MSC 9643, Bethesda, MD
            20892-9643, USA
            Tel: 301 443 1706
            Fax: 301 443 9890
            Email: mEST@mail.nih.gov
            Oligo-dT track not found, Not I site shown in beginning of sequence
            is likely internal to the message. cDNA Library Preparation: M.B.
            Soares Lab Clone distribution: Researchers may obtain BMAP cDNA
            clones from RESEARCH GENETICS. It should be noted that Bento Soares
            is generating a small number of additional specialized
            non-redundant arrays of BMAP cDNAs whose availability will be
            considered under appropriate and limited collaborative arrangements
            The tissue for this library was contributed by Dr. Xin-Yuan Fu,
            Yale University School of Medicine The following repetitive
            elements were found in this cDNA sequence: 15-105,
            >GC_rich#Low_complexity
            Seq primer: M13 Forward
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..333
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="taxon:10090"
                     /clone="UI-M-CG0p-bgi-c-07-0-UI"
                     /lab_host="DH10B (Life Technologies)"
                     /clone_lib="NIH_BMAP_Ret4_S2"
                     /note="Vector: pT7T3D-Pac (Pharmacia) with a modified
                     polylinker; Site_1: Not I; Site_2: Eco RI; The
                     NIH_BMAP_Ret4_S2 library is a subtracted library,
                     ultimately derived from mouse retina tissue libraries at
                     various stages of development. For a detailed description
                     of the library from which this clone was derived, please
                     visit our web site at brainest.eng.uiowa.edu. The tissue
                     for this library was contributed by Dr. Xin-Yuan Fu, Yale
                     University School of Medicine
                     TAG_SEQ=None found"
BASE COUNT       47 a    124 c    122 g     40 t
ORIGIN



  Query Match             23.6%;  Score 234.4;  DB 10;  Length 333;
  Best Local Similarity   95.6%;  Pred. No. 1.8e-47;
  Matches  241;  Conservative    0;  Mismatches   11;  Indels    0;  Gaps    0;



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db     82 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 141

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    142 TACTCGCCCAGCCTCAAGTCGGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 201

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||||
Db    202 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCG 261

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    262 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 321

Qy    241 GGGCTGCAGCGC 252
          ||||||||||||
Db    322 GGGCTGCAGCGC 333



RESULT 12 

AW476657 

LOCUS       AW476657     529 bp    mRNA    linear   EST 24-FEB-2000
DEFINITION  uq79e01.y1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:2937336 5'
            similar to TR:O35073 O35073 NTAK ALPHA2-1P ;, mRNA sequence.
ACCESSION   AW476657
VERSION     AW476657.1  GI:7046763
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 529)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Other_ESTs: uq79e01.x1
            Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            MGI:1049756
            Seq primer: -40RP from Gibco
            High quality sequence stop: 459.
FEATURES             Location/Qualifiers
     source          1..529
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="CZECH II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:2937336"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
BASE COUNT      136 a    144 c    146 g    103 t
ORIGIN



  Query Match             23.4%;  Score 232.2;  DB 9;  Length 529;
  Best Local Similarity   93.1%;  Pred. No. 7.5e-47;
  Matches  243;  Conservative    0;  Mismatches   18;  Indels    0;  Gaps    0;



Qy 653 TCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACA 712 



Db 



2 



61 



Qy    713 GCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCA 772
          ||||||||||||| ||||||||||||||||| || |||||||||||||| ||||| ||||
Db     62 GCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCA 121

Qy    773 AGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCT 832
          ||||||| || || ||||||||||| ||||||||||||||||||||||||||||||||||
Db    122 AGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCT 181

Qy    833 GCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGT 892
          ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db    182 GCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGT 241

Qy    893 ACATGCCAGATCCTAAGCAAA 913
          |||||||||||||||||||||
Db    242 ACATGCCAGATCCTAAGCAAA 262



RESULT 13 

AA772412 

LOCUS       AA772412     297 bp    mRNA    linear   EST 31-DEC-1998
DEFINITION  ai44e12.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            1359886 3' similar to TR:P43328 P43328 NEU DIFFERENTIATION FACTOR
            NDF04 ;, mRNA sequence.
ACCESSION   AA772412
VERSION     AA772412.1  GI:2824195
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 297)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima Bonaldo,
            Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Possible reversed clone: similarity on wrong strand
            Possible reversed clone: polyT not found
            Insert Length: 667 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 267.
FEATURES             Location/Qualifiers
     source          1..297
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="1359886"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia
                     ) with a modified polylinker; Site_1: Not I; Site_2: Eco
                     RI; 1st strand cDNA was primed with a Not I - oligo(dT)
                     primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M.Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
BASE COUNT       37 a     68 c    105 g     37 t
ORIGIN



  Query Match             22.5%;  Score 224;  DB 9;  Length 297;
  Best Local Similarity   92.2%;  Pred. No. 6.5e-45;
  Matches  236;  Conservative    0;  Mismatches   20;  Indels    0;  Gaps    0;



Qy    405 GATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGG 464
          || || |  | | |        |  || | |||||||||||||||||||||||||||||||
Db     42 GAGCCCGATCCCGGGAGAAAGCCACCCGGCCAAGTTGAAGAAGATGAAGAGCCAGACGGG 101

Qy    465 ACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTC 524
          |||||||||||||||||||||||||||||||||||||||||| |  ||||||||||||||
Db    102 ACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCGGTGAATCCCCAGCCTTC 161

Qy    525 CTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATA 584
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    162 CTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATA 221

Qy    585 TGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGG 644
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    222 TGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGG 281

Qy    645 GGAGTATGTCTGCGAG 660
          ||||||||||||||||
Db    282 GGAGTATGTCTGCGAG 297



RESULT 14 

BX089049/c
LOCUS       BX089049     362 bp    mRNA    linear   EST 23-JAN-2003
DEFINITION  BX089049 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            IMAGp998M133119 ; IMAGE:1240116, mRNA sequence.
ACCESSION   BX089049
VERSION     BX089049.1  GI:27825909
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 362)
  AUTHORS   Ebert,L., Heil,O., Hennig,S., Neubert,P., Partsch,E., Peters,M.,
            Radelof,U., Schneider,D. and Korn,B.
  TITLE     Human UnigeneSet - RZPD3
  JOURNAL   Unpublished
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998M133119.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Human UnigeneSet - RZPD3 (RZPDLIB No. 972)
            http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=972
            Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            M13r, Primer sequence: TTTCACACAGGAAACAGCTATGAC.
FEATURES             Location/Qualifiers
     source          1..362
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGp998M133119 ; IMAGE:1240116"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia
                     ) with a modified polylinker; Site_1: Not I; Site_2: Eco
                     RI; 1st strand cDNA was primed with a Not I - oligo(dT)
                     primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M.Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
BASE COUNT       64 a    100 c    120 g     77 t      1 others
ORIGIN



  Query Match             18.1%;  Score 180;  DB 13;  Length 362;
  Best Local Similarity   99.4%;  Pred. No. 5.5e-34;
  Matches  180;  Conservative    0;  Mismatches    1;  Indels    0;  Gaps    0;



Qy    656 GCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCG 715
          ||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
Db    362 GCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGNCCGGCTTTACGTCAACAGCG 303

Qy    716 TGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGT 775
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    302 TGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGT 243

Qy    776 CCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCA 835
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    242 CCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCA 183

Qy    836 A 836
          |
Db    182 A 182



RESULT 15 

AW762061 

LOCUS       AW762061     256 bp    mRNA    linear   EST 04-MAY-2000
DEFINITION  ur53c01.y1 NCI_CGAP_Mam3 Mus musculus cDNA clone IMAGE:3153984 5'
            similar to TR:O35073 O35073 NTAK ALPHA2-1P ;, mRNA sequence.
ACCESSION   AW762061
VERSION     AW762061.1  GI:7693978
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 256)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Other_ESTs: ur53c01.x1
            Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Lothar Hennighausen Ph.D., Chu-Xia Deng Ph.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            image.llnl.gov/image/html/iresources.shtml
            MGI:1056740
            Seq primer: -40RP from Gibco
            High quality sequence stop: 161.
FEATURES             Location/Qualifiers
     source          1..256
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="129, C57BL/6J, FVB/N"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:3153984"
                     /tissue_type="tumor, gross tissue"
                     /dev_stage="10 months"
                     /lab_host="DH10B"
                     /clone_lib="NCI_CGAP_Mam3"
                     /note="Organ: mammary; Vector: pCMV-SPORT6; Site_1: SalI;
                     Site_2: NotI; Cloned unidirectionally. Primer: Oligo dT.
                     Library constructed by Life Technologies. Investigators
                     providing samples: Lothar Hennighausen/Chu-Xia Deng, NIH
                     Reference for transgenic model: Xu et al., Nature Genetics
                     22, 37-43 (1999)."
BASE COUNT       61 a     62 c     71 g     62 t
ORIGIN



Query Match 17.3%; Score 171.8; DB 9; Length 256;

Best Local Similarity 93.7%; Pred. No. 5.2e-32; 

Matches 179; Conservative 0; Mismatches 12; Indels 0; Gaps 0; 

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||| ||||||||||||||||| || |||||||||||||| ||||| ||||||||||| ||
Db      1 CACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTG 60

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
           || ||||||||||| |||||||||||||||||||||||||||||||||||||||||| |
Db     61 TGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTGC 120

Qy    843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
          ||| ||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db    121 AAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCTGA 180

Qy    903 TCCTAAGCAAA 913
          |||||||||||
Db    181 TCCTAAGCAAA 191
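
The marker line between each Qy/Db pair flags identical positions with `|`, and the Matches/Mismatches/Indels figures are tallies over those positions. The following is a minimal Python sketch of how both could be recomputed from aligned rows. The function names are illustrative and not part of GenCore, and treating `-` as the gap character is an assumption based on the alignments in this report. The rows are those of the RESULT 15 alignment with OCR spacing removed.

```python
def match_line(qy: str, db: str) -> str:
    """Return a marker line with '|' under identical, non-gap positions."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    return "".join("|" if q == d and q != "-" else " " for q, d in zip(qy, db))


def tally(qy: str, db: str) -> tuple:
    """Count (matches, mismatches, indels) for one pair of aligned rows."""
    matches = mismatches = indels = 0
    for q, d in zip(qy, db):
        if q == "-" or d == "-":   # assumption: '-' marks an indel
            indels += 1
        elif q == d:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels


# Aligned rows of RESULT 15 (Qy 723-913 against Db 1-191).
ROWS = [
    ("CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG",
     "CACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTG"),
    ("CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC",
     "TGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTGC"),
    ("AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA",
     "AAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCTGA"),
    ("TCCTAAGCAAA", "TCCTAAGCAAA"),
]

# Sum the per-row tallies column by column.
totals = tuple(sum(col) for col in zip(*(tally(q, d) for q, d in ROWS)))
for q, d in ROWS:
    print(q)
    print(match_line(q, d))
    print(d)
print("matches, mismatches, indels:", totals)
```

Summing per row rather than over one concatenated string keeps the check usable on a report where rows may be damaged individually.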



Search completed: January 14, 2004, 11:47:08 
Job time : 2580.42 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:          January 14, 2004, 07:12:21 ; Search time 3976 Seconds
                 (without alignments)
                 10227.407 Million cell updates/sec

Title:           US-09-864-675-1
Perfect score:   994
Sequence:        1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        2888711 seqs, 20454813386 residues

Total number of hits satisfying chosen parameters:    5777422

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
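
The throughput figure in the run header can be cross-checked against the other header values. The Python sketch below assumes that a "cell update" is one query-position by database-residue comparison, and that both strands were searched (suggested by the hit count being exactly twice the number of database sequences). Neither assumption is stated in the report itself, and the names used are illustrative.

```python
# Values copied from the run header above.
QUERY_LENGTH = 994             # length of US-09-864-675-1 ("Perfect score")
DB_SEQS = 2_888_711            # "Searched: ... seqs"
DB_RESIDUES = 20_454_813_386   # "Searched: ... residues"
SEARCH_SECONDS = 3976          # "Search time ... Seconds"
STRANDS = 2                    # assumption: forward and reverse strand searched


def cell_updates_per_sec(query_len, residues, strands, seconds):
    """Dynamic-programming cells filled per second of search time."""
    return query_len * residues * strands / seconds


rate_millions = cell_updates_per_sec(
    QUERY_LENGTH, DB_RESIDUES, STRANDS, SEARCH_SECONDS) / 1e6
hits = DB_SEQS * STRANDS

print(f"{rate_millions:.3f} Million cell updates/sec")
print(f"{hits} hits satisfying chosen parameters")
```

Under these assumptions the computed values agree with the 10227.407 Million cell updates/sec and 5777422 hits printed in the header.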



Database :       GenEmbl:*
                 1:  gb_ba:*
                 2:  gb_htg:*
                 3:  gb_in:*
                 4:  gb_om:*
                 5:  gb_ov:*
                 6:  gb_pat:*
                 7:  gb_ph:*
                 8:  gb_pl:*
                 9:  gb_pr:*
                 10:  gb_ro:*
                 11:  gb_sts:*
                 12:  gb_sy:*
                 13:  gb_un:*
                 14:  gb_vi:*
                 15:  em_ba:*
                 16:  em_fun:*
                 17:  em_hum:*
                 18:  em_in:*
                 19:  em_mu:*
                 20:  em_om:*
                 21:  em_or:*
                 22:  em_ov:*
                 23:  em_pat:*
                 24:  em_ph:*
                 25:  em_pl:*
                 26:  em_ro:*
                 27:  em_sts:*
                 28:  em_un:*
                 29:  em_vi:*
                 30:  em_htg_hum:*
                 31:  em_htg_inv:*
                 32:  em_htg_other:*
                 33:  em_htg_mus:*
                 34:  em_htg_pln:*
                 35:  em_htg_rod:*
                 36:  em_htg_mam:*
                 37:  em_htg_vrt:*
                 38:  em_sy:*
                 39:  em_htgo_hum:*
                 40:  em_htgo_mus:*
                 41:  em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                  %
Result          Query
   No.   Score  Match  Length  DB  ID         Description
     1   913.2   91.9    3020   9  AB005060   AB005060 Homo sapi
     2     900   90.5    1884   6  AR098145   AR098145 Sequence
     3     900   90.5    1884   6  AR116617   AR116617 Sequence
     4     881   88.6     993   6  AR072053   AR072053 Sequence
     5     800   80.5    2947  10  D89995     D89995 Rattus sp.
     6   799.4   80.4    3076   6  E16456     E16456 Rat mRNA fo
     7   799.4   80.4    3077  10  D89996     D89996 Rattus sp.
     8   738.6   74.3    3441   6  AR072052   AR072052 Sequence
     9   547.2   55.1    1607   6  AR098144   AR098144 Sequence
    10   547.2   55.1    1607   6  AR116616   AR116616 Sequence
    11     492   49.5    1476   6  AR098146   AR098146 Sequence
    12     492   49.5    1476   6  AR116618   AR116618 Sequence
    13     492   49.5    2268   6  AR098155   AR098155 Sequence
    14     492   49.5    2268   6  AR116627   AR116627 Sequence
    15   487.4   49.0    2188  10  AB001576   AB001576 Rattus sp
    16   467.8   47.1    2467   6  AR098143   AR098143 Sequence
    17   467.8   47.1    2467   6  AR116615   AR116615 Sequence
    18   424.8   42.7  118504   9  AC094080   AC094080 Homo sapi
    19   424.8   42.7  152838   2  AC011589   AC011589 Homo sapi
    20   424.8   42.7  170797   9  AC011379   AC011379 Homo sapi
    21   424.8   42.7  210675   2  AC026272   AC026272 Homo sapi
    22     424   42.7    1054   6  AX406616   AX406616 Sequence
    23     424   42.7    1054   9  HS2NRG01   AF119151 Homo sapi
    24   387.2   39.0  140307   2  AC131191   AC131191 Mus muscu
    25     384   38.6  253462   2  AC096477   AC096477 Rattus no
    26   359.6   36.2    1207   6  AR072054   AR072054 Sequence
    27   227.2   22.9     240  10  AY227025   AY227025 Mus muscu
    28     173   17.4     419   6  AX406617   AX406617 Sequence
    29     173   17.4     419   9  HS2NRG02   AF119152 Homo sapi
    30     173   17.4  120236   9  AC008523   AC008523 Homo sapi
    31     173   17.4  189050   9  AC008667   AC008667 Homo sapi
    32   142.6   14.3   85703   2  AC020830   AC020830 Mus muscu
    33   142.6   14.3  191101   2  AC127350   AC127350 Mus muscu
    34   139.4   14.0  226038   2  AC106592   AC106592 Rattus no
    35   139.4   14.0  273080   2  AC098540   AC098540 Rattus no
    36   139.4   14.0  302176   2  AC096479   AC096479 Rattus no
    37   135.6   13.6     142   6  AR072064   AR072064 Sequence
    38   124.6   12.5     493   6  AX406618   AX406618 Sequence
    39   124.6   12.5     493   9  HS2NRG03   AF119153 Homo sapi
    40   122.4   12.3     350   6  AX406619   AX406619 Sequence
    41   122.4   12.3     350   9  HS2NRG04   AF119154 Homo sapi
    42   108.4   10.9  206683   2  BX323592   BX323592 Danio rer
    43   108.4   10.9  220700   2  BX005008   BX005008 Danio rer
    44     108   10.9   85703   2  AC020830   AC020830 Mus muscu
    45    95.4    9.6    1140   6  AR022498   AR022498 Sequence



ALIGNMENTS 



RESULT 1
AB005060
LOCUS       AB005060    3020 bp    mRNA    linear   PRI 14-NOV-1997
DEFINITION  Homo sapiens mRNA for NTAK, complete cds.
ACCESSION   AB005060
VERSION     AB005060.1  GI:2626738
KEYWORDS    NTAK.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 3020)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (24-JUN-1997) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
FEATURES             Location/Qualifiers
     source          1..3020
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /cell_line="SK-N-SH"
                     /cell_type="neuroblastoma"
     CDS             226..2778
                     /codon_start=1
                     /product="NTAK"
                     /protein_id="BAA23417.1"
                     /db_xref="GI:2626739"
                     /translation="MRQVCCSALPPPPLEKGRCSSYSDSSSSSSERSSSSSSSSSESG
                     SSSRSSSNNSSISRPAAPPEPRPQQQPQPRSPAARRAAARSRAAAAGGMRRDPAPGFS
                     MLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREPPASGRVAL
                     VKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFAPLDTNG
                     KNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKE
                     LNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTT
                     LSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP
                     DPKQKAEELYQKRVLTITGICVALLWGIVCWAYCKTKKQRKQMHNHLRQNMCPAHQ
                     NRSLANGPSHPRLDPEEIQMADYISKNVPATDHVIRRETETTFSGSHSCSPSHHCSTA
                     TPTSSHRHESHTWSLERSESLTSDSQSGIMLSSVGTSKCNSPACVEARARR7VAAYNLE
                     ERRRATAPPYHDSVDSLRDSPHSERYVSALTTPARLSPVDFHYSLATQVPTFEITSPN
                     SAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGPGPGPGPGPGADMQRSYDSYYYPAA
                     GPGPRRGTCALGGSLGSLPASPFRIPEDDEYETTQECAPPPPPRPRARGASRRTSAGP
                     RRWRRSRLNGLAAQRARAARDSLSLSSGSGGGSASASDDDADDADGALAAESTPFLGL
                     RGAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRGPPPRAKQDSAPL"
     polyA_site      3020
                     /note="39 A nucleotides"
BASE COUNT      615 a   1015 c    937 g    453 t
ORIGIN

Query Match 91.9%; Score 913.2; DB 9; Length 3020; 

Best Local Similarity 99.7%; Pred. No. 4.8e-171; 

Matches 915; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 
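
The three statistics lines printed for each result are related by simple arithmetic, which is useful for spotting OCR damage in the figures. The Python sketch below is illustrative only. The mismatch penalty of 0.6 is not stated anywhere in the report; it is inferred from the printed scores. The gap cost of Gapop + Gapext × length is likewise an assumption that happens to reproduce the one gapped score shown for RESULT 2. The function names are hypothetical.

```python
PERFECT_SCORE = 994  # "Perfect score" from the run header


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, mismatches, indels):
    """Identical positions as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


def implied_score(matches, mismatches, gap_lengths=(),
                  mismatch_penalty=0.6,   # inferred from the report, not documented
                  gap_open=10.0, gap_ext=1.0):
    """Score implied by +1 per match, a mismatch penalty, and affine gaps."""
    gap_cost = sum(gap_open + gap_ext * n for n in gap_lengths)
    return matches - mismatch_penalty * mismatches - gap_cost


# RESULT 1 (AB005060): Matches 915; Mismatches 3; Indels 0; Score 913.2
print(round(query_match_pct(913.2), 1))
print(round(best_local_similarity_pct(915, 3, 0), 1))
print(round(implied_score(915, 3), 1))
```

Running the same functions on a result with a gap, for example `implied_score(914, 5, gap_lengths=[1])` for the figures printed under RESULT 2, gives a second consistency check.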



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    502 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 561

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    562 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 621

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    622 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 681

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    682 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 741

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    742 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 801

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    802 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 861

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    862 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 921

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    922 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 981

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    982 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 1041

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1042 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 1101

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1102 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 1161

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1162 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 1221

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1222 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 1281

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1282 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1341

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1342 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1401

Qy    901 GATCCTAAGCAAAGTGTC 918
          |||||||||||||  | |
Db   1402 GATCCTAAGCAAAAAGCC 1419



RESULT 2
AR098145
LOCUS       AR098145    1884 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 5 from patent US 6074841.
ACCESSION   AR098145
VERSION     AR098145.1  GI:12807402
KEYWORDS    .
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1884)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 5 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1884
                     /organism="unknown"
BASE COUNT      426 a    607 c    560 g    291 t
ORIGIN

Query Match 90.5%; Score 900; DB 6; Length 1884;

Best Local Similarity 99.3%; Pred. No. 2e-168;

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1;



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy    901 GATCCTAAGCAAAGTGTCCT 920
          |||||||||||||    |||
Db   1117 GATCCTAAGCAAAAGCACCT 1136



RESULT 3
AR116617
LOCUS       AR116617    1884 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence 5 from patent US 6133423.
ACCESSION   AR116617
VERSION     AR116617.1  GI:14096939
KEYWORDS    .
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1884)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 5 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..1884
                     /organism="unknown"
BASE COUNT      426 a    607 c    560 g    291 t
ORIGIN

Query Match 90.5%; Score 900; DB 6; Length 1884;

Best Local Similarity 99.3%; Pred. No. 2e-168;

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1;



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy    901 GATCCTAAGCAAAGTGTCCT 920
          |||||||||||||    |||
Db   1117 GATCCTAAGCAAAAGCACCT 1136



RESULT 4
AR072053
LOCUS       AR072053     993 bp    DNA     linear   PAT 18-FEB-2000
DEFINITION  Sequence 3 from patent US 5912326.
ACCESSION   AR072053
VERSION     AR072053.1  GI:7222941
KEYWORDS    .
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Chang,H.
  TITLE     Cerebellum-derived growth factors
  JOURNAL   Patent: US 5912326-A 3 15-JUN-1999;
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="unknown"
BASE COUNT      230 a    271 c    311 g    181 t
ORIGIN



Query Match 88.6%; Score 881; DB 6; Length 993; 

Best Local Similarity 93.0%; Pred. No. 1.2e-164; 

Matches 923; Conservative 0; Mismatches 70; Indels 0; Gaps 0; 



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||| 
Db    121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    361 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| |||||| ||| ||||||| ||||||
Db    421 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db    481 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db    541 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 600

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db    601 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db    661 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||||| 
Db    721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 780

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840 



II II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 7 81 TGTGTGAATGGAGGCGTGTGCTACTACATCGT^lGGCATCAACCAACTCTCCTGCAAATGT 84 0 

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy    901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy    961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAA 993
          |||||||||||||||||||||||||||||||||
Db    961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAA 993



RESULT 5
D89995
LOCUS       D89995      2947 bp    mRNA    linear   ROD 07-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha1, complete cds.
ACCESSION   D89995
VERSION     D89995.1  GI:2605629
KEYWORDS    NTAK alpha1.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 2947)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-DEC-1996) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
COMMENT     Sequence updated (28-Feb-1997) by: Hiroshi Ishiguro.
FEATURES             Location/Qualifiers
     source          1..2947
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             79..2685
                     /codon_start=1
                     /product="NTAK alpha1"
                     /protein_id="BAA23344.1"
                     /db_xref="GI:2605630"
                     /translation="MRQVCCSALPPPLEKARCSSYSYSDSSSSSSSNNSSSSTSSRSS
                     SRSSSRSSRGSTTTTSSSENSGSNSGSIFRPAAPPEPRPQPQPQPRSPAARRAAARSR
                     AAAAGGMRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGS
                     SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTE
                     QPLVFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAA
                     AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILG
                     KDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF
                     FGQRCLEKLPLRLYMPDPKQKHLGFELKEAEELYQKRVLTITGICVALLWGIVCWA
                     YCKTKKQRRQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVPATDHV
                     IRREAETTFSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGIMLSSV
                     GTSKCNSPACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSALTTPA
                     RLSPVDFHYSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGP
                     GPGADMQRSYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQECAPP
                     PPPRPRTRGASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSASASDDD
                     ADDADGALAAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRG
                     PPTRAKQDSGPL"
BASE COUNT      665 a    945 c    895 g    442 t
ORIGIN

Query Match 80.5%; Score 800; DB 10; Length 2947;

Best Local Similarity 91.8%; Pred. No. 1.4e-148;

Matches 845; Conservative 0; Mismatches 75; Indels 0; Gaps 0;



Qy        1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
            ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db      403 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 462

Qy       61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
            |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db      463 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 522

Qy      121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
            |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||| 
Db      523 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 582

Qy      181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
            ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db      583 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 642

Qy      241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
            |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db      643 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 702

Qy      301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
            |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db      703 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 762

Qy      361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
            ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db      763 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 822

Qy      421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
            ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db      823 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 882

Qy      481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
            || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db      883 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 942

Qy      541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
            || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db      943 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1002

Qy      601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
            |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db     1003 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1062

Qy      661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
            || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db     1063 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1122

Qy      721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
            ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||||| 
Db     1123 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1182

Qy      781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
            || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db     1183 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1242

Qy      841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
            ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1243 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1302

Qy      901 GATCCTAAGCAAAGTGTCCT 920
            |||||||||||||    |||
Db     1303 GATCCTAAGCAAAAGCACCT 1322



RESULT 6 

E16456
LOCUS       E16456       3076 bp    DNA     linear   PAT 28-JUL-1999
DEFINITION  Rat mRNA for neuregulin-like Transmembrane Activator for ErbB
            Kinases (NTAK).
ACCESSION   E16456
VERSION     E16456.1  GI:5711139
KEYWORDS    JP 1998179166-A/1.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (bases 1 to 3076)
  AUTHORS   Higashiyama,S., Taniguchi,N., Ishiguro,K. and Nagatsu,T.
  TITLE     GENE ENCODING RECEPTOR TYPE TYROSINE-KINASE ERB B LIGAND AND ITS
  JOURNAL   Patent: JP 1998179166-A 1 07-JUL-1998;
            HIGASHIYAMA SHIGEKI
COMMENT     OS   Rattus sp. (rat)
            PN   JP 1998179166-A/1
            PD   07-JUL-1998
            PF   25-DEC-1996 JP 1996356998
            PI   HIGASHIYAMA SHIGEKI, TANIGUCHI NAOYUKI, ISHIGURO KEIJI,
            PI   NAGATSU TOSHIHARU
            PC   C12N15/09,C07K14/705,C07K16/28,C12N5/10,C12N15/02,C12P21/02,
            PC   C12P21/08,
            PC   C12Q1/68,G01N33/53,G01N33/566//A61K31/70,A61K38/46,A61K39/395,
            PC   A61K48/00,
            PC   C07H21/04,(C12N5/10,C12R1:91
            (C12P21/02,C12R1:91);
            CC   strandedness: Double;
            CC   topology: Linear;
            FH   Key             Location/Qualifiers
            FH
            FT   source          1..3076
            FT                   /organism='Rattus sp.'
            FT                   /cell_line='PC12'
            FT   CDS             232..2814
            FT                   /product='NTAK protein'
FEATURES             Location/Qualifiers
     source          1..3076
                     /organism="Rattus sp."
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10118"
BASE COUNT      673 a    996 c    944 g    463 t
ORIGIN

Query Match 80.4%; Score 799.4; DB 6; Length 3076;
Best Local Similarity 92.2%; Pred. No. 1.8e-148;
Matches 842; Conservative 0; Mismatches 71; Indels 0; Gaps 0;



Qy        1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
            ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db      556 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 615

Qy       61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
            |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db      616 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 675

Qy      121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
            |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||| 
Db      676 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 735

Qy      181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
            ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db      736 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 795

Qy      241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
            |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db      796 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 855

Qy      301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
            |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db      856 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 915

Qy      361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
            ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db      916 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 975

Qy      421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
            ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db      976 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1035

Qy      481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
            || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db     1036 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1095

Qy      541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
            || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db     1096 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1155

Qy      601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
            |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db     1156 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1215

Qy      661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
            || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db     1216 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1275

Qy      721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
            ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||||| 
Db     1276 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1335

Qy      781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
            || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db     1336 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1395

Qy      841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
            ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1396 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1455

Qy      901 GATCCTAAGCAAA 913
            |||||||||||||
Db     1456 GATCCTAAGCAAA 1468
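
The summary figures printed for each hit appear to be related by simple arithmetic. The sketch below is an inference from the numbers in this listing, not a documented definition from the search program: it assumes Query Match is the hit score divided by the perfect score (994 for this query), and that Best Local Similarity is matches divided by the aligned length (matches + mismatches + indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Assumed definition: hit score as a percentage of the perfect score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int = 0) -> float:
    """Assumed definition: matches as a percentage of the aligned columns."""
    aligned = matches + mismatches + indels
    return 100.0 * matches / aligned


# Figures reported for RESULT 6 (E16456); the perfect score comes from the report header.
PERFECT_SCORE = 994
qm = query_match_pct(799.4, PERFECT_SCORE)
bls = best_local_similarity_pct(matches=842, mismatches=71, indels=0)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

Under these assumptions the values agree with the percentages printed for the hits in this section, which is a quick way to check a transcribed record for digit errors.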



RESULT 7 

D89996
LOCUS       D89996       3077 bp    mRNA    linear   ROD 07-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha2, complete cds.
ACCESSION   D89996
VERSION     D89996.1  GI:2605631
KEYWORDS    NTAK alpha2.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 3077)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-DEC-1996) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
FEATURES             Location/Qualifiers
     source          1..3077
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             233..2815
                     /codon_start=1
                     /product="NTAK alpha2"
                     /protein_id="BAA23345.1"
                     /db_xref="GI:2605632"
                     /translation="MRQVCCSALPPPLEKARCSSYSYSDSSSSSSSNNSSSSTSSRSS
                     SRSSSRSSRGSTTTTSSSENSGSNSGSIFRPAAPPEPRPQPQPQPRSPAARRAAARSR
                     AAAAGGMRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGS
                     SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTE
                     QPLVFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAA
                     AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILG
                     KDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF
                     FGQRCLEKLPLRLYMPDPKQKAEELYQKRVLTITGICVALLVVGIVCVVAYCKTKKQR
                     RQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVPATDHVIRREAETT
                     FSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGIMLSSVGTSKCNSP
                     ACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSALTTPARLSPVDFH
                     YSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGPGPGADMQR
                     SYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQECAPPPPPRPRTR
                     GASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSASASDDDADDADGAL
                     AAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRGPPTRAKQD
                     SGPL"
BASE COUNT      673 a    996 c    945 g    463 t
ORIGIN

Query Match 80.4%; Score 799.4; DB 10; Length 3077;
Best Local Similarity 92.2%; Pred. No. 1.8e-148;
Matches 842; Conservative 0; Mismatches 71; Indels 0; Gaps 0;



Qy        1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
            ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db      557 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 616

Qy       61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
            |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db      617 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 676

Qy      121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
            |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||| 
Db      677 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 736

Qy      181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
            ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db      737 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 796

Qy      241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
            |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db      797 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 856

Qy      301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
            |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db      857 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 916

Qy      361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
            ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db      917 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 976

Qy      421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
            ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db      977 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1036

Qy      481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
            || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db     1037 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1096

Qy      541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
            || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db     1097 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1156

Qy      601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
            |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db     1157 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1216

Qy      661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
            || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db     1217 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1276

Qy      721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
            ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||||| 
Db     1277 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1336

Qy      781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
            || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db     1337 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1396

Qy      841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
            ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1397 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1456

Qy      901 GATCCTAAGCAAA 913
            |||||||||||||
Db     1457 GATCCTAAGCAAA 1469



RESULT 8 
AR072052
LOCUS       AR072052     3441 bp    DNA     linear   PAT 18-FEB-2000
DEFINITION  Sequence from patent US 5912326.
ACCESSION   AR072052
VERSION     AR072052.  GI:7222940
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 3441)
  AUTHORS   Chang,H.
  TITLE     Cerebellum-derived growth factors
  JOURNAL   Patent: US 5912326-A 1 15-JUN-1999;
FEATURES             Location/Qualifiers
     source          1..3441
                     /organism="unknown"
BASE COUNT      777 a   1057 c   1015 g    592 t
ORIGIN

Query Match 74.3%; Score 738.6; DB 6; Length 3441;
Best Local Similarity 90.4%; Pred. No. 2.1e-136;
Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0;

Qy        1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
            ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db      180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy       61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
            |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db      240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy      121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
            |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||| 
Db      300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy      181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
            ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db      360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy      241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
            |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db      420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy      301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
            |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db      480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy      361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
            ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db      540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy      421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
            ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db      600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy      481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
            || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db      660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy      541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
            || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db      720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy      601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
            |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db      780 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 839

Qy      661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
            || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db      840 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 899

Qy      721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
            ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||||| 
Db      900 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 959

Qy      781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
            || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db      960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019

Qy      841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
            ||    |||| |  |||  | || |||  | ||
Db     1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 1052



RESULT 9 
AR098144
LOCUS       AR098144     1607 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 3 from patent US 6074841.
ACCESSION   AR098144
VERSION     AR098144.1  GI:12807401
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1607)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 3 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1607
                     /organism="unknown"
BASE COUNT      365 a    500 c    480 g    262 t
ORIGIN

Query Match 55.1%; Score 547.2; DB 6; Length 1607;
Best Local Similarity 92.3%; Pred. No. 2e-98;
Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0;



Qy      371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
            | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db        2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy      431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
            |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db       62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy      491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
            ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db      122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy      551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
            | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db      182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy      611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
            ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db      242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy      671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
            |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db      302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy      731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
            ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db      362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy      791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
            ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db      422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy      851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy      911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy      971 CAAGCACCTTGGATTTGAATTAAA 994
            ||||||||||||||||||||| ||
Db      602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 10 
AR116616
LOCUS       AR116616     1607 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence 3 from patent US 6133423.
ACCESSION   AR116616
VERSION     AR116616.1  GI:14096938
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1607)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 3 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..1607
                     /organism="unknown"
BASE COUNT      365 a    500 c    480 g    262 t
ORIGIN

Query Match 55.1%; Score 547.2; DB 6; Length 1607;
Best Local Similarity 92.3%; Pred. No. 2e-98;
Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0;



Qy      371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
            | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db        2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy      431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
            |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db       62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy      491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
            ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db      122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy      551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
            | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db      182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy      611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
            ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db      242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy      671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
            |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db      302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy      731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
            ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db      362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy      791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
            ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db      422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy      851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy      911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy      971 CAAGCACCTTGGATTTGAATTAAA 994
            ||||||||||||||||||||| ||
Db      602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 11 

AR098146
LOCUS       AR098146     1476 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 7 from patent US 6074841.
ACCESSION   AR098146
VERSION     AR098146.1  GI:12807403
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1476)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 7 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1476
                     /organism="unknown"
BASE COUNT      335 a    473 c    452 g    216 t
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 1476;
Best Local Similarity 92.8%; Pred. No. 1.7e-87;
Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;



Qy      363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
            || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||   
Db       98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy      423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy      483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy      543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy      603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy      663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy      723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy      783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy      843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy      903 TCCTAAGCAAAGTGTC 918
            |||||||||||  | |
Db      638 TCCTAAGCAAAAAGCC 653



RESULT 12 

AR116618
LOCUS       AR116618     1476 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence from patent US 6133423.
ACCESSION   AR116618
VERSION     AR116618.  GI:14096940
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1476)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 7 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..1476
                     /organism="unknown"
BASE COUNT      335 a    473 c    452 g    216 t
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 1476;
Best Local Similarity 92.8%; Pred. No. 1.7e-87;
Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;

Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db     98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy    843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy    903 TCCTAAGCAAAGTGTC 918
          |||||||||||  | |
Db    638 TCCTAAGCAAAAAGCC 653



RESULT 13 
AR098155 



LOCUS AR098155 2268 bp DNA linear PAT 14-FEB-2001 

DEFINITION Sequence 31 from patent US 6074841. 
ACCESSION AR098155 

VERSION AR098155.1 GI:12807412 

KEYWORDS 

SOURCE Unknown. 

ORGANISM Unknown. 

Unclassified. 
REFERENCE 1 (bases 1 to 2268) 

AUTHORS Gearing,D.P. and Busfield,S.J.

TITLE Don-1 gene and polypeptides and uses therefor 

JOURNAL Patent: US 6074841-A 31 13-JUN-2000; 
FEATURES Location/Qualifiers 
source 1..2268
/organism="unknown"
BASE COUNT 502 a 734 c 701 g 331 t 

ORIGIN 

Query Match 49.5%; Score 492; DB 6; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 1.7e-87; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db     98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy    843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy    903 TCCTAAGCAAAGTGTC 918
          |||||||||||  | |
Db    638 TCCTAAGCAAAAAGCC 653



RESULT 14
AR116627
LOCUS AR116627 2268 bp DNA linear PAT 16-MAY-2001
DEFINITION Sequence 31 from patent US 6133423.
ACCESSION AR116627
VERSION AR116627.1 GI:14096949
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 2268)
AUTHORS Gearing,D.P. and Busfield,S.J.
TITLE Don-1 gene and polypeptides and uses therefor
JOURNAL Patent: US 6133423-A 31 17-OCT-2000;
FEATURES Location/Qualifiers
source 1..2268
/organism="unknown"
BASE COUNT 502 a 734 c 701 g 331 t
ORIGIN

Query Match 49.5%; Score 492; DB 6; Length 2268;
Best Local Similarity 92.8%; Pred. No. 1.7e-87;
Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;



Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db     98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy    843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy    903 TCCTAAGCAAAGTGTC 918
          |||||||||||  | |
Db    638 TCCTAAGCAAAAAGCC 653



RESULT 15
AB001576
LOCUS AB001576 2188 bp mRNA linear ROD 13-FEB-1999
DEFINITION Rattus sp. mRNA for NTAK alpha2-lp, partial cds.
ACCESSION AB001576
VERSION AB001576.1 GI:2605478
KEYWORDS neural- and thymus-derived activator for ErbB kinases.
SOURCE Rattus sp.
ORGANISM Rattus sp.
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
Rattus.
REFERENCE 1 (sites)
AUTHORS Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
and Ishiguro,H.
TITLE A novel brain-derived member of the epidermal growth factor family
that interacts with ErbB3 and ErbB4
JOURNAL J. Biochem. 122 (3), 675-680 (1997)
MEDLINE 98006324
PUBMED 9348101
REFERENCE 2 (bases 1 to 2188)
AUTHORS Ishiguro,H.
TITLE Direct Submission
JOURNAL
FEATURES Location/Qualifiers
source 1..2188
/organism="Rattus sp."
/mol_type="mRNA"
/db_xref="taxon:10118"
/cell_line="PC12"
/cell_type="pheochromocytoma"
CDS <1..1926
/note="neural- and thymus-derived activator for ErbB
kinases (NTAK); a member of the epidermal growth factor
(EGF) family"
/codon_start=1
/product="NTAK alpha2-lp"
/protein_id="BAA23348.1"
/db_xref="GI:2605479"
/translation="FFFFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTG

EVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVED 
AGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIE 
GINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKAEELYQKRVLTITGICVALLWGI 
VCWAYCKTKKQRRQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVP 
ATDHVIRREAETTFSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGI
MLSSVGTSKCNSPACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSA 
LTTPARLSPVDFHYSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAP 
PGPGPGPGADMQRSYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQ 
ECAPPPPPRPRTRGASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSAS 
ASDDDADDADGALAAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASS
RHSRGPPTRAKQDSGPL" 

BASE COUNT 515 a 674 c 643 g 356 t 

ORIGIN 

Query Match 49.0%; Score 487.4; DB 10; Length 2188; 

Best Local Similarity 90.3%; Pred. No. 1.4e-86; 

Matches 521; Conservative 0; Mismatches 56; Indels 0; Gaps 0; 

Qy    337 TTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAG 396
          ||  | |||||||| |||||||||||  ||||  | |||||||||||  |||||||||||
Db      4 TTTTTTTTTAAGACAGCCTTTGCCCCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAG 63

Qy    397 GTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGC 456
          ||||||||||||||||||||||||||||| |||||||||||| ||||||||||||||||
Db     64 GTGGGCAAGATCCTGTGCACTGACTGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGT 123

Qy    457 CAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCC 516
          ||||| ||| ||||||| |||||||| ||||| ||||||||||| || || || || |||
Db    124 CAGACAGGAGAGGTGGGCGAGAAGCAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCC 183

Qy    517 CAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGC 576
          ||||| ||||| || ||||||||||| ||||||||||||||||| || || |||||||||
Db    184 CAGCCCTCCTATCGATGGTTCAAGGACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGC 243

Qy    577 ATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAG 636
          ||||| |||||||||||||||||||||||||| |||||||||||||| ||||||||||||
Db    244 ATCAAGTATGGCAACGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAG 303

Qy    637 GACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGC 696
          |||||||| ||||| ||||| ||||| ||||||||||| ||||||||||| ||  |||||
Db    304 GACGCTGGAGAGTACGTCTGTGAGGCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGC 363

Qy    697 CGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAG 756
          |||||  | |||||||| ||||||||||| ||||| ||||||||||||||||||||||||
Db    364 CGGCTCCATGTCAACAGTGTGAGCACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAG 423

Qy    757 TGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGC 816
          ||||| ||||||||||||||||| || || ||||||||||| |||||||||||||| |||
Db    424 TGCAATGAGACAGCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGC 483

Qy    817 ATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAA 876
          |||||||| |||||||||||||||||||| ||||||||||||||||||||||||||||||
Db    484 ATCAACCAACTCTCCTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAA 543

Qy    877 CTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAA 913
          |||||||||||||||||||||||||||||||||||||
Db    544 CTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAA 580



Search completed: January 14, 2004, 10:22:56 
Job time : 3984.5 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: January 14, 2004, 07:12:21 ; Search time 321.696 Seconds 

(without alignments) 
8340.911 Million cell updates/sec 



Title: US-09-864-675-1
Perfect score: 994
Sequence: 1 atgaggcgcgacccggcccc caccttggatttgaattaaa 994

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 2552756 seqs, 1349719017 residues

Total number of hits satisfying chosen parameters: 5105512

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
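
The header figures above can be cross-checked against one another. The following is a minimal sketch, not part of the GenCore output: it assumes that both strands of every database sequence are searched (a factor of 2 that the report does not state but that is consistent with the hit count being exactly twice the number of sequences) and that a "cell update" is one dynamic-programming cell, i.e. one query residue against one database residue. The function names are hypothetical.

```python
# Values copied from the report header.
QUERY_LENGTH = 994            # query length (equals the "Perfect score")
DB_RESIDUES = 1_349_719_017   # "Searched: ... residues"
DB_SEQS = 2_552_756           # "Searched: ... seqs"
SEARCH_SECONDS = 321.696      # "Search time ... Seconds"

# Assumption (not stated in the report): both strands are searched,
# so every database sequence is compared against the query twice.
STRANDS = 2


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Dynamic-programming cells filled per second, in millions."""
    cells = strands * query_len * db_residues
    return cells / seconds / 1e6


def candidate_hits(db_seqs, strands=STRANDS):
    """Number of query-versus-sequence comparisons performed."""
    return strands * db_seqs


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
hits = candidate_hits(DB_SEQS)
print(f"{rate:.1f} million cell updates/sec, {hits} candidate hits")
```

Under these assumptions the computed rate agrees with the reported 8340.911 million cell updates/sec to within rounding of the search time, and the hit count reproduces the reported 5105512.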



Database : N_Geneseq_19Jun03:*
1: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1980.DAT:*
2: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1981.DAT:*
3: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1982.DAT:*
4: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1983.DAT:*
5: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1984.DAT:*
6: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1985.DAT:*
7: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1986.DAT:*
8: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1987.DAT:*
9: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1988.DAT:*
10: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2000.DAT:*
22: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2001A.DAT:*
23: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2001B.DAT:*
24: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2002.DAT:*
25: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2003.DAT:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



SUMMARIES 

% 

Result Query 



No. 


Score 


Match 


Length 


DB 


ID 


Description 


1 


994 


100 . 


o 


994 


24 


AAS18019 


Human cDNA encodin 


2 


900 


90. 


5 


1884 


19 


AAV17814 


Homo sapiens don-1 


3 


882 


88 . 


7 


1803 


18 


AAT87923 


Rat cerebellum der 


4 


849 


85 . 


4 


897 


24 


AAS18020 


Human cDNA encodin 


5 


799.4 


80 . 


4 


3076 


19 


AAV43674 


Receptor type tyro 


6 


788 


79 . 


3 


1863 


25 


ABS56035 


cDNA encoding huma 


7 


738.6 


74 . 


3 


3441 


18 


AAT87922 


Rat cerebellum der 


8 


547.2 


55 . 


1 


1607 


19 


AAV17813 


Mus mus cuius don-1 


9 


535.2 


53. 


8 


1561 


25 


ABS56034 


cDNA encoding muri 


10 


492 


49 . 


5 


2268 


19 


AAV17816 . 


Homo sapiens don-1 


11 


491 


49 . 


4 


1474 


25 


ABS56036 


cDNA encoding huma 


12 


491 


49 . 


4 


2266 


25 


ABS56045 


cDNA encoding huma 


13 


490.4 


49 . 


3 


1476 


19 


AAV17815 


Homo sapiens don-1 


14 


464.6 


46 . 


7 


2467 


19 


AAV17812 


Mus mus cuius don-1 


15 


455. 8 


45. 


9 


2442 


25 


ABS56033 


cDNA encoding muri 


16 


424 


42 . 


7 


1054 


24 


ABL40993 


Human neuregulin 2 


17 


380 


38 . 


2 


667 


18 


AAT87924 


Human cerebellum d 


18 


173 


17 . 


4 


419 


24 


ABL40994 


Human neuregulin 2 


19 


137.2 


13 . 


8 


1039 


23 


AAS71393 


DNA encoding novel 


20 


124.6 


12. 


5 


480 


24 


ABL40995 


Human neuregulin 2 


21 


122.4 


12 . 


3 


350 


24 


ABL40996 


Human neuregulin 2 


22 


95.4 


9 . 


6 


1140 


15 


AAQ58321 


GGF2BPP2. Bos tau 


23 


95.4 


9 , 


6 


1140 


15 


AAQ62840 


GGF2BPP2 . Bos tau 


24 


95.4 


9 , 


6 


1140 


16 


AAQ74912 


Bovine glial cell 


25 


95.4 


9 . 


. 6 


1140 


17 


AAT48088 


Human neuregulin G 


26 


95.4 


9 . 


, 6 


1140 


17 


AAT31001 


Glial growth facto 


27 


95.4 


9. 


, 6 


1140 


17 


AAT06731 


BPP2 glial growth 


28 


95.4 


9 . 


. 6 


1140 


20 


AAX81201 


Nucleotide sequenc 


29 


91.8 


9 . 


2 


1193 


13 


AAQ30670 


GGF2BPP2 .CDS. Syn 


30 


91.8 


9. 


. 2 


1193 


15 


AAQ58303 


GGF-II cDNA sequen 


31 


91.8 


9, 


. 2 


1193 


15 


AAQ62849 


GGF-II cDNA sequen 


32 


91.8 


9, 


.2 


1193 


16 


AAQ74885 


Putative bovine gl 


33 


91.8 


9. 


.2 


1193 


17 


AAT48079 


Bovine neuregulin 


34 


91.8 


9. 


.2 


1193 


17 


AAT30997 


Bovine glial growt 


35 


91.8 


9, 


.2 


1193 


17 


AAT06703 


Bovine glial growt 


36 


87 


8, 


.8 


300 


24 


ABL40997 


Human neuregulin 2 


37 


79 


7. 


. 9 


1986 


20 


AAZ32061 


Human METH2 relate 


38 


78.2 


7, 


.9 


1027 


22 


AAF8 0062 


Nucleotide sequenc 


39 


78.2 


7, 


.9 


3086 


22 


AAF80059 


Nucleotide sequenc 


40 


77.4 


7, 


. 8 


1986 


22 


AAC90318 


L12260 cDNA clone. 


41 


77.4 


7, 


.8 


2003 


17 


AAT48090 


Human neuregulin G 


42 


77.4 


7 


. 8 


2003 


17 


AAT30995 


Glial growth facto 


43 


77.4 


7 


.8 


2003 


17 


AAT06739 


Glial growth facto 


44 


77.4 


7 


.8 


2003 


20 


AAZ32062 


Human METH2 relate 


45 


77.4 


7 


. 8 


2003 


22 


AAC90319 


136352 cDNA clone. 
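
The "% Query Match" column in the summaries appears to be the hit score expressed as a percentage of the perfect score (994), rounded to one decimal place. The sketch below illustrates that relationship on a few rows; the formula is an inference from the reported figures rather than something the report states, and the function name is hypothetical.

```python
PERFECT_SCORE = 994  # "Perfect score" from the report header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place.

    Assumption: this is how the '% Query Match' column is derived.
    """
    return round(100.0 * score / perfect_score, 1)


# (result number, score, reported % query match) taken from the summaries
rows = [(2, 900, 90.5), (8, 547.2, 55.1), (15, 455.8, 45.9), (36, 87, 8.8)]

for number, score, reported in rows:
    print(number, score, query_match_percent(score), reported)
```

For each sampled row the computed percentage equals the reported one, which supports reading the column as a normalised score rather than as percent identity.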



ALIGNMENTS 



RESULT 1 
AAS18019 

ID AAS18019 standard; cDNA; 994 BP. 
XX 

AC AAS18019; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human cDNA encoding Neuregulin-2alpha, NRG-2alpha. 
XX 

KW Human; ss; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis; 

KW cell survival; cell growth; cell differentiation; erbB receptor; 

KW cardiomyopathy; ischaemic damage; cardiac trauma; heart failure; 

KW atherosclerosis; vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 1..993 

FT /*tag= a 

FT /product= "NRG-2alpha" 

XX 

PN WO200189568-A1. 

XX 

PD 29-NOV-2001. 
XX 

PF 23-MAY-2001; 2001WO-US16896.
XX 

PR 23-MAY-2000; 2000US-206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR P-PSDB; AAU11635. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating 

PT multiple sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

PT disease, by increasing mitogenesis, survival, growth or differentiation 

PT of a cell - 

XX 

PS Claim 57; Fig 6; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG) -2 

CC polypeptide comprising or consisting of a sequence for human 

CC NRG-2alpha or NRG-2beta (clone 2b7) and the polynucleotides encoding 

CC them. Also included are a vector expressing the protein, a host cell

CC comprising the vector, a transgenic non-human animal transformed with 



CC the vector or having a knockout mutation in one or both NRG-2 

CC alleles and an anti-NRG-2 antibody. Analysis of mutations in NRG-2 in an 

CC individual is useful for diagnosing an increased likelihood of 

CC developing a NRG-2-related disease or condition in a test subject. 

CC NRG-2 is useful for increasing the mitogenesis, survival, growth or 

CC differentiation of a cell (e.g. a neuronal cell), where the cell 

CC expresses an erbB receptor. NRG-2 is useful for treating diseases 

CC and disorders such as cardiomyopathy (preferably degenerative congenital 

CC disease), ischaemic damage, cardiac trauma or heart failure or which 

CC has a condition affecting smooth muscle which include atherosclerosis, 

CC vascular lesion, vascular hypertension, and degenerative congenital 

CC vascular disease, myasthenia gravis, a neurodegenerative disorder, 

CC peripheral neuropathy, a sensory nerve fiber neuropathy, a motor fiber 

CC and a sensory nerve fiber neuropathy, multiple sclerosis, amyotrophic 

CC lateral sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

CC disease, Parkinson's disease, cerebellar ataxia, and spinal cord injury. 

CC The antibody is useful for treatment of a tumour comprising inhibiting 

CC proliferation of a tumour cell preferably a glial tumour cell, for 

CC treating of neurofibromatosis by inhibiting glial cell mitogenesis. 

CC The present sequence encodes NRG-2alpha. 

XX 

SQ Sequence 994 BP; 230 A; 279 C; 304 G; 181 T; 0 other; 



Query Match 100.0%; Score 994; DB 24; Length 994; 

Best Local Similarity 100.0%; Pred. No. 1.4e-231; 

Matches 994; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

Qy    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy    901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy    961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
          ||||||||||||||||||||||||||||||||||
Db    961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994



RESULT 2 
AAV17814 

ID AAV17814 standard; cDNA; 1884 BP. 
XX 

AC AAV17814;
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; 

KW epithelial cell; proliferation; stimulation; treatment; tumours; 

KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate; 

KW gastrointestinal tract; uterus; wound healing; transmembrane; ss.

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 






FT CDS 664..1884
FT /*tag= a
FT /note= "don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US14585.
XX
PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Busfield SJ, Gearing DP;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48381.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of
PT melanomas and adenocarcinoma(s), and for wound healing
XX
PS Claim 4; Fig 3; 121pp; English.
XX
CC The sequence is that of a human don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.
XX
SQ Sequence 1884 BP; 426 A; 607 C; 560 G; 291 T; 0 other;

Query Match 90.5%; Score 900; DB 19; Length 1884;
Best Local Similarity 99.3%; Pred. No. 1.1e-208;

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 
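
The score reported for this hit can be reproduced from the match, mismatch and gap counts. The sketch below is an illustration under stated assumptions, not the GenCore implementation: the report names the scoring table (IDENTITY_NUC) but does not print its values, so the +1 match reward and the -0.6 mismatch penalty are inferred from the reported scores, and each gap is assumed to cost Gapop plus Gapext per gapped position. The function name is hypothetical.

```python
MATCH = 1.0       # assumed reward per identical base (not printed in the report)
MISMATCH = -0.6   # assumed penalty, inferred from the reported scores
GAP_OPEN = 10.0   # "Gapop" from the report header
GAP_EXTEND = 1.0  # "Gapext" from the report header


def alignment_score(matches, mismatches, gap_lengths=()):
    """Recompute a local alignment score from its summary counts.

    gap_lengths holds one entry per gap, giving its length in positions.
    Assumption: a gap of length n costs GAP_OPEN + n * GAP_EXTEND.
    """
    score = matches * MATCH + mismatches * MISMATCH
    for length in gap_lengths:
        score -= GAP_OPEN + GAP_EXTEND * length
    return round(score, 1)


# This hit: 914 matches, 5 mismatches, 1 indel in 1 gap (reported score 900).
print(alignment_score(914, 5, gap_lengths=[1]))
# Ungapped hits reported earlier in this file:
print(alignment_score(516, 40))  # reported score 492
print(alignment_score(521, 56))  # reported score 487.4
```

All three reported scores are reproduced under these assumptions, which suggests, without proving, that the scoring table uses these values.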

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1116

Qy        901 GATCCTAAGCAAAGTGTCCT 920
              |||||||||||||    |||
Db       1117 GATCCTAAGCAAAAGCACCT 1136



RESULT 3 
AAT87923 

ID AAT87923 standard; cDNA; 1803 BP. 
XX 

AC AAT87923; 
XX 

DT 18-DEC-1997 (first entry) 



XX 

DE Rat cerebellum derived growth factor 2 cDNA. 
XX 

KW Rat; cerebellum derived growth factor; CDGF2; screening; binding; 

KW modulation; erbB type receptor; identification; indication; risk; 

KW proliferation; differentiation; induction; neuron; hyperplasia; 

KW stem cell culture; intracerebral graft; alleviation; repair; 

KW behavioural defect; nervous system; central; peripheral; nerve; 

KW prothesis; damage; entubulation; cell survival; treatment; 

KW injury; trauma; ischaemia; ischemia; stroke; infection; disorder; 

KW inflammation; neurodegeneration; disease; Parkinson's; 

KW Huntingdon's; amylotrophic lateral sclerosis; sensory; retina; 

KW spinocerebellar degeneration; multiple sclerosis; neoplasia; 

KW amalignant glioma; medulloblastoma; neuroectodermal tumour; ds.

XX 

OS Rattus rattus.
XX 

FH Key Location/Qualifiers 

FT CDS 1..993 

FT /*tag= a 

FT sig_peptide 1..69 

FT /*tag= b 

FT mat_peptide 70..990

FT /*tag= c 

FT /product= cerebellum_derived_growth_factor
XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997. 
XX 

PF 09-SEP-1996; 96WO-US14484.
XX

PR 08-SEP-1995; 95US-0525864.
XX

PA (HARD ) HARVARD COLLEGE.

PA (STRD ) UNIV LELAND STANFORD JUNIOR. 

PA (STRD ) UNIV LELAND S STANFORD. 

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR P-PSDB; AAW27537. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the 

PT treatment of neuronal injury and proliferative disorders 

XX 

PS Claim 17; Pages 70-74; 94pp; English. 
XX 

CC The present sequence encodes rat cerebellum derived growth factor 2 

CC (CDGF2), which can be used to screen for modulators of CDGF

CC binding to erbB type receptors. Identification of a modification or 

CC mutation in a CDGF gene, or aberrant expression of a CDGF gene or 

CC levels of soluble CDGF may be used to indicate the risk of unwanted 

CC cell proliferation or differentiation. 

CC CDGF may be used to induce neuronal differentiation in stem cell 

CC culture, and maintain the integrity of a terminally differentiated 

CC neuronal cell culture, e.g. useful for intracerebral grafting to 



CC alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially 

CC where a crushed or severed axon is entubulated by a prosthetic. 

CC CDGF may also be used to enhance neuronal cell survival in the 

CC central or peripheral nervous system, to treat neurological 

CC conditions associated with nervous system injury, e.g. traumatic, 

CC chemical or vasal injury and deficits such as ischaemia resulting 

CC from stroke, infectious/inflammatory and tumour induced injury, 

CC chronic neurodegenerative disease including Parkinson's and 

CC Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the 

CC central nervous system, e.g. amalignant gliomas, medulloblastomas 

CC and neuroectodermal tumours. 

XX 

SQ Sequence 1803 BP; 408 A; 549 C; 537 G; 309 T; 0 other; 

Query Match 88.7%; Score 882; DB 18; Length 1803; 

Best Local Similarity 93.0%; Pred. No. 2.6e-204; 

Matches 924; Conservative 0; Mismatches 70; Indels 0; Gaps 0; 

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db         61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db        121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db        181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db        301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db        361 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db        421 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 480

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db        481 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 540

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db        541 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 600

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db        601 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 660

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db        661 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 720

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db        721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 780

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db        781 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 840

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        841 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy        901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy        961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
              ||||||||||||||||||||||||||||||||||
Db        961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994





RESULT 4 
AAS18020 

ID AAS18020 standard; cDNA; 897 BP. 
XX 

AC AAS18020; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human cDNA encoding Neuregulin-2beta, NRG-2beta. 
XX 

KW Human; ss; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis;

KW cell survival; cell growth; cell differentiation; erbB receptor; 

KW cardiomyopathy; ischaemic damage; cardiac trauma; heart failure; 

KW atherosclerosis; vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 



OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 1..897 

FT /*tag= a 

FT /product= "NRG-2beta"

XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001.
XX 

PF 23-MAY-2001; 2001WO-US16896.
XX

PR 23-MAY-2000; 2000US-206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR P-PSDB; AAU11636. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating 

PT multiple sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

PT disease, by increasing mitogenesis, survival, growth or differentiation 

PT of a cell - 

XX 

PS Claim 57; Fig 8; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human 

CC NRG-2alpha or NRG-2beta (clone 2b7) and the polynucleotides encoding 

CC them. Also included are a vector expressing the protein, a host cell

CC comprising the vector, a transgenic non-human animal transformed with 

CC the vector or having a knockout mutation in one or both NRG-2

CC alleles and an anti-NRG-2 antibody. Analysis of mutations in NRG-2 in an

CC individual is useful for diagnosing an increased likelihood of 

CC developing a NRG-2-related disease or condition in a test subject. 

CC NRG-2 is useful for increasing the mitogenesis, survival, growth or

CC differentiation of a cell (e.g. a neuronal cell), where the cell

CC expresses an erbB receptor. NRG-2 is useful for treating diseases

CC and disorders such as cardiomyopathy (preferably degenerative congenital

CC disease), ischaemic damage, cardiac trauma or heart failure or which

CC has a condition affecting smooth muscle which include atherosclerosis, 

CC vascular lesion, vascular hypertension, and degenerative congenital 

CC vascular disease, myasthenia gravis, a neurodegenerative disorder, 

CC peripheral neuropathy, a sensory nerve fiber neuropathy, a motor fiber 

CC and a sensory nerve fiber neuropathy, multiple sclerosis, amyotrophic 

CC lateral sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

CC disease, Parkinson's disease, cerebellar ataxia, and spinal cord injury. 

CC The antibody is useful for treatment of a tumour comprising inhibiting 

CC proliferation of a tumour cell preferably a glial tumour cell, for 

CC treating of neurofibromatosis by inhibiting glial cell mitogenesis. 

CC The present sequence encodes NRG-2beta. 

XX 

SQ Sequence 897 BP; 200 A; 261 C; 282 G; 154 T; 0 other; 



Query Match 85.4%; Score 849; DB 24; Length 897; 

Best Local Similarity 98.3%; Pred. No. 2.1e-196; 

Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0;



Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
              ||    |||| |  |||  | || |||  | ||
Db        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873



RESULT 5 
AAV43674 

ID AAV43674 standard; cDNA; 3076 BP. 
XX 

AC AAV43674; 
XX 

DT 29-SEP-1998 (first entry) 
XX 

DE Receptor type tyrosine kinase ErbB ligand encoding cDNA. 
XX 

KW Receptor type tyrosine kinase ErbB; ligand; diagnostic agent; 

KW nervous disease; cancer; ss. 

XX 

OS Rattus sp.
XX 

FH Key Location/Qualifiers 

FT CDS 232..2814

FT /*tag= a 

FT /product= "ligand of receptor type tyrosine kinase ErbB" 
XX 

PN JP10179166-A. 
XX 

PD 07-JUL-1998. 
XX 

PF 25-DEC-1996; 96JP-0356998.
XX

PR 25-DEC-1996; 96JP-0356998.
XX 

PA (HIGA/) HIGASHIYAMA S. 
XX 

DR WPI; 1998-430952/37. 

DR P-PSDB; AAW63700. 
XX 

PT Gene coding the ligand of the tyrosine kinase ErbB receptor - useful 

PT for diagnosing and treating nervous diseases and cancer 

XX 

PS Examples; Pages 9-13; 17pp; Japanese. 
XX 

CC This cDNA encodes the ligand of receptor type tyrosine kinase ErbB. A 

CC prokaryotic or eukaryotic host cell transformed by a recombinant vector 

CC containing the encoding DNA can be used for the recombinant production of 

CC the protein. The invention provides a method for inhibiting the formation 

CC of the ligand of receptor type tyrosine kinase ErbB in an animal using 

CC an antibody recognizing the protein. The ligand of the tyrosine kinase 

CC ErbB receptor and associated materials can be used for treating or 

CC diagnosing nervous diseases and cancers. 
XX 

SQ Sequence 3076 BP; 673 A; 996 C; 944 G; 463 T; 0 other; 



Query Match 80.4%; Score 799.4; DB 19; Length 3076; 

Best Local Similarity 92.2%; Pred. No. 3.4e-184; 

Matches 842; Conservative 0; Mismatches 71; Indels 0; Gaps 0;

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db        556 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 615

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        616 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 675

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db        676 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 735

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db        736 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 795

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db        796 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 855

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db        856 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 915

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db        916 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 975

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db        976 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1035

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db       1036 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1095

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db       1096 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1155

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db       1156 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1215

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db       1216 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1275

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db       1276 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1335

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db       1336 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1395

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
              ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1396 CCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1455

Qy        901 GATCCTAAGCAAA 913
              |||||||||||||
Db       1456 GATCCTAAGCAAA 1468



RESULT 6 
ABS56035 

ID ABS56035 standard; cDNA; 1863 BP. 
XX 

AC ABS56035; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human membrane-bound splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens.



XX 

FH Key Location/Qualifiers

FT CDS 643..1863

FT /*tag= a 

FT /partial 

FT /product= "Membrane-bound splice variant of Don-1"

FT /note= "This sequence lacks a stop codon" 

XX 



PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX 

PA (GEAR/) GEARING DP. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71638. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 



PS Claim 4; Fig 3; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins. 

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome IE 

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes human membrane-bound

CC splice variant of Don-1. 

XX 

SQ Sequence 1863 BP; 422 A; 602 C; 553 G; 286 T; 0 other; 

Query Match 79.3%; Score 788; DB 25; Length 1863; 

Best Local Similarity 97.6%; Pred. No. 1.7e-181; 

Matches 898; Conservative 0; Mismatches 5; Indels 17; Gaps

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
Db        213 [sequence illegible in source] 270

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db        271 TACTCGCCCAGCCTCAAGTCA--GCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 328

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||||
Db        329 GGCAAGGTACAGGGGCTGGT--CAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 386

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db        387 CCCGCCTCGGGTCGGGTGGCG--GGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 444

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db        445 GGGCTGCAGCGCGAGCAGGTG--CAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 502

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db        503 CGCTACATCTTTTTCCTGGAG--CACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 560

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||| ||||||||||||||||  |||||||||||||||||||||||||||||||||| |
Db        561 CCCCT-GATACCAACGGCAAAA--CTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 617

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||
Db        618 TGCGCCACCCGGCCCAAGTTGA--AAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 675

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        676 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 735

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        736 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 795

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        796 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 855

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        856 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 915

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        916 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 975

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        976 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1035

Qy        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1036 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 1095

Qy        901 GATCCTAAGCAAAGTGTCCT 920
              |||||||||||||    |||
Db       1096 GATCCTAAGCAAAAGCACCT 1115





RESULT 7 
AAT87922 



ID AAT87922 standard; cDNA; 3441 BP. 
XX 

AC AAT87922; 
XX 

DT 18-DEC-1997 (first entry) 
XX 

DE Rat cerebellum derived growth factor 1 cDNA. 
XX 

KW Rat; cerebellum derived growth factor; CDGF1; screening; binding; 

KW modulation; erbB type receptor; identification; indication; risk; 

KW proliferation; differentiation; induction; neuron; hyperplasia; 

KW stem cell culture; intracerebral graft; alleviation; repair; 

KW behavioural defect; nervous system; central; peripheral; nerve; 

KW prothesis; damage; entubulation; cell survival; treatment; 

KW injury; trauma; ischaemia; ischemia; stroke; infection; disorder;

KW inflammation; neurodegeneration; disease; Parkinson's; 

KW Huntingdon's; amylotrophic lateral sclerosis; sensory; retina; 

KW spinocerebellar degeneration; multiple sclerosis; neoplasia; 

KW amalignant glioma; medulloblastoma; neuroectodermal tumour; ds.

XX

OS Rattus rattus.
XX

FH Key Location/Qualifiers

FT CDS 180..2444

FT /*tag= a

FT sig_peptide 180..248

FT /*tag= b

FT mat_peptide 249..2441

FT /*tag= c

FT /product= cerebellum derived_growth_factor
XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997. 
XX 

PF 09-SEP-1996; 96WO-US14484.
XX

PR 08-SEP-1995; 95US-0525864.
XX 

PA (HARD ) HARVARD COLLEGE. 

PA (STRD ) UNIV LELAND STANFORD JUNIOR. 

PA (STRD ) UNIV LELAND" S STANFORD. 

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR P-PSDB; AAW27536. 
XX 

PT Rat and human cerebellum-derived growth factors - used in the 

PT treatment of neuronal injury and proliferative disorders 

XX 

PS Claim 17; Pages 63-66; 94pp; English. 
XX 

CC The present sequence encodes rat cerebellum derived growth factor 1 

CC (CDGF1), which can be used to screen for modulators of CDGF

CC binding to erbB type receptors. Identification of a modification or 

CC mutation in a CDGF gene, or aberrant expression of a CDGF gene or 

CC levels of soluble CDGF may be used to indicate the risk of unwanted 

CC cell proliferation or differentiation. 

CC CDGF may be used to induce neuronal differentiation in stem cell 

CC culture, and maintain the integrity of a terminally differentiated

CC neuronal cell culture, e.g. useful for intracerebral grafting to 

CC alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially 

CC where a crushed or severed axon is entubulated by a prosthetic. 

CC CDGF may also be used to enhance neuronal cell survival in the 

CC central or peripheral nervous system, to treat neurological 

CC conditions associated with nervous system injury, e.g. traumatic, 

CC chemical or vasal injury and deficits such as ischaemia resulting 

CC from stroke, infectious/inflammatory and tumour induced injury, 

CC chronic neurodegenerative disease including Parkinson's and 

CC Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the 

CC central nervous system, e.g. amalignant gliomas, medulloblastomas

CC and neuroectodermal tumours.
XX

SQ Sequence 3441 BP; 777 A; 1057 C; 1015 G; 592 T; 0 other;
Query Match 74.3%; Score 738.6; DB 18; Length 3441;
Best Local Similarity 90.4%; Pred. No. 2e-169;

Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0;



Qy        1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
            ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db      180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy       61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
            |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db      240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy      121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
            |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db      300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy      181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
            ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db      360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy      241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
            |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db      420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy      301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
            |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db      480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy      361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
            ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db      540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy      421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
            ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db      600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy      481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
            || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db      660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy      541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
            || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db      720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy      601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
            |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db      780 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 839

Qy      661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
            || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db      840 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 899

Qy      721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
            ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db      900 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 959

Qy      781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
            || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db      960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019

Qy      841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
            ||    |||| |  |||  | || |||  | ||
Db     1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 1052
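
The line printed between each Qy row and its Db row marks the positions at which the two aligned residues are identical. As a minimal sketch (the helper name match_line is hypothetical and is not part of the search program), such a line can be regenerated from a Qy/Db pair, which is one way to check a row whose marker line has been damaged in extraction:

```python
def match_line(query: str, subject: str) -> str:
    """Mark identical aligned residues with '|' and everything else with ' '."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        # A gap character ('-') never counts as an identity.
        "|" if q == s and q != "-" else " "
        for q, s in zip(query.upper(), subject.upper())
    )


# Final row of RESULT 7, with the spaces introduced by extraction removed.
qy = "CCAAATGGATTCTTCGGACAGAGATGTTTGGAG"  # Qy 841-873
db = "CCTGTGGGATACACCGGGGACAGGTGTCAGCAG"  # Db 1020-1052

line = match_line(qy, db)
print(qy)
print(line)
print(db)
print(line.count("|"), "identical positions out of", len(qy))
```

Summing the identical positions over all rows of a result should reproduce the "Matches" figure reported for that result.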



RESULT 8 
AAV17813 

ID AAV17813 standard; cDNA; 1607 BP. 
XX 

AC AAV17813;
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Mus musculus don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; 

KW epithelial cell; proliferation; stimulation; treatment; tumours; 

KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate; 

KW gastrointestinal tract; uterus; wound healing; secreted protein; ss. 

XX 

OS Mus musculus. 
XX 

FH Key Location/Qualifiers 
FT CDS 79..624

FT /*tag= a 

FT /note= "secreted don-1 polypeptide" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US14585.
XX

PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX 

PA (MILL- ) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Busfield SJ, Gearing DP; 
XX 

DR WPI; 1998-169084/15. 
DR P-PSDB; AAW48380. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of

PT melanomas and adenocarcinoma(s), and for wound healing

XX 

PS Claim 4; Fig 2; 121pp; English. 
XX 

CC The sequence is that of a murine don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.
XX

SQ Sequence 1607 BP; 365 A; 500 C; 480 G; 262 T; 0 other;

Query Match 55.1%; Score 547.2; DB 19; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 4.8e-123; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 

Qy      371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
            | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db        2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy      431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
            |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db       62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy      491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
            ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db      122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy      551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
            | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db      182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy      611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
            ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db      242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy      671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
            |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db      302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy      731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
            ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db      362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy      791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
            ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db      422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy      851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy      911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy      971 CAAGCACCTTGGATTTGAATTAAA 994
            ||||||||||||||||||||| ||
Db      602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 9 
ABS56034 

ID ABS56034 standard; cDNA; 1561 BP. 
XX 

AC ABS56034; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding murine secreted splice variant of Don-1. 
XX 

KW Murine; Don-1; epidermal growth factor; EGF; neuregulin; mouse; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; chromosome 18; gene; ss. 

XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 

FT CDS 78..623

FT /*tag= a

FT /product= "Secreted splice variant of Don-1"
XX

PN US2002127594-A1.
XX

PD 12-SEP-2002.
XX

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX 

PA (GEAR/) GEARING D P. 

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71637. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells,

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 

PS Claim 4; Fig 2; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins. 

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18.

CC Don-1 polypeptides are useful for stimulating proliferation of a cell.

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 



CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes murine secreted 

CC splice variant of Don-1. 
XX 

SQ Sequence 1561 BP; 361 A; 479 C; 465 G; 256 T; 0 other; 

Query Match 53.8%; Score 535.2; DB 25; Length 1561; 

Best Local Similarity 92.1%; Pred. No. 3.9e-120; 

Matches 575; Conservative 0; Mismatches 48; Indels 1; Gaps 1; 

Qy      371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
            | |||||||||||  |||||||||||||||||||||||||||||||||||||||||| ||
Db        2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCA-CC 60

Qy      431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
            |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db       61 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 120

Qy      491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
            ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db      121 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 180

Qy      551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
            | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db      181 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 240

Qy      611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
            ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db      241 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 300

Qy      671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
            |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db      301 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 360

Qy      731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
            ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db      361 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 420

Qy      791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
            ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db      421 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 480

Qy      851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      481 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 540

Qy      911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      541 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 600

Qy      971 CAAGCACCTTGGATTTGAATTAAA 994
            ||||||||||||||||||||| ||
Db      601 CAAGCACCTTGGATTTGAATTGAA 624
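
The summary figures printed for each result can be cross-checked against one another. The sketch below is illustrative only: the function names are hypothetical, the perfect score (994) and the gap parameters (Gapop 10.0, Gapext 1.0) are taken from the report header, and the mismatch penalty of 0.6 and the gap cost of Gapop + Gapext per gapped position are assumptions that happen to reproduce the scores in this listing, not documented properties of the IDENTITY_NUC table.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percentage of aligned columns that are identical."""
    return 100.0 * matches / (matches + mismatches + indels)


def query_match(score: float, perfect_score: float = 994.0) -> float:
    """Score as a percentage of the query's perfect (self-match) score."""
    return 100.0 * score / perfect_score


def identity_score(matches, mismatches, gap_lengths=(),
                   mismatch_penalty=0.6, gapop=10.0, gapext=1.0):
    """Assumed scoring: +1 per match, -0.6 per mismatch (inferred, not documented),
    and gapop + gapext * length for each gap."""
    gap_cost = sum(gapop + gapext * n for n in gap_lengths)
    return matches - mismatch_penalty * mismatches - gap_cost


# RESULT 9 (ABS56034): Matches 575; Mismatches 48; Indels 1; Gaps 1; Score 535.2
print(round(best_local_similarity(575, 48, 1), 1))   # reported: 92.1
print(round(query_match(535.2), 1))                  # reported: 53.8
print(round(identity_score(575, 48, gap_lengths=[1]), 1))  # reported: 535.2
```

Under the same assumptions, RESULT 7 (789 matches, 84 mismatches, no gaps) gives 738.6, and RESULT 8 (576 matches, 48 mismatches) gives 547.2, in agreement with the reported scores.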



RESULT 10 
AAV17816 

ID AAV17816 standard; cDNA; 2268 BP. 
XX 

AC AAV17816; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; 

KW epithelial cell; proliferation; stimulation; treatment; tumours;

KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate;

KW gastrointestinal tract; uterus; wound healing; transmembrane; ss.

XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers

FT CDS 69..2012

FT /*tag= a 

FT /note= "don-1 polypeptide" 
XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US14585.
XX

PR 19-NOV-1996; 96US-0753007.

PR 19-AUG-1996; 96US-0699591.
XX

PA (MILL- ) MILLENNIUM BIOTHERAPEUTICS INC.
XX

PI Busfield SJ, Gearing DP;
XX

DR WPI; 1998-169084/15.

DR P-PSDB; AAW48383.
XX

PT Mouse and human don-1 polypeptide(s) - useful for treatment of

PT melanomas and adenocarcinoma(s), and for wound healing

XX 

PS Claim 4; Fig 7; 121pp; English. 
XX 

CC The sequence is that of a human don-1 gene splice variant. 

CC Don-1 polypeptides stimulate proliferation of epithelial cells 

CC and thus are implicated in melanomas and adenocarcinomas in which 

CC epithelial cells proliferate out of control. Compounds that 

CC interfere with don-1 mediated cell proliferation can be used 

CC in the treatment of tumours such as melanomas and adenocarcinomas 

CC of the skin, oesophagus, lung, breast, liver, pancreas, 

CC gastrointestinal tract, colon, prostate or uterus. Alternatively, 

CC don-1 polypeptides can be used to stimulate epithelial cell 

CC proliferation, e.g. for wound healing. 



XX 

SQ Sequence 2268 BP; 502 A; 735 C; 700 G; 331 T; 0 other; 



Query Match 49.5%; Score 492; DB 19; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 1.3e-109; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy      363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
            || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db       98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy      423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
             |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy      483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy      543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy      603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy      663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy      723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy      783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy      843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy      903 TCCTAAGCAAAGTGTC 918
            |||||||||||  | |
Db      638 TCCTAAGCAAAAAGCC 653



RESULT 11 
ABS56036 

ID ABS56036 standard; cDNA; 1474 BP. 
XX 

AC ABS56036; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human second splice variant of Don-1. 



XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers

FT CDS 68..1473

FT /*tag= a

FT /partial

FT /product= "Second splice variant of Don-1"

FT /note= "This sequence lacks a stop codon"

FT /transl_except= (pos: 107..108, aa:Lys)

FT /note= "This codon has an apparent 1 nucleotide

FT deletion which alters the reading frame"

XX

PN US2002127594-A1.
XX

PD 12-SEP-2002.
XX

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX 

PA (GEAR/) GEARING D P. 
PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71639. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 

PS Claim 4; Fig 4; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins.

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18. 

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 



CC in gene therapy. The present sequence encodes human second 

CC splice variant of Don-1. 

XX

SQ Sequence 1474 BP; 335 A; 472 C; 451 G; 216 T; 0 other; 

Query Match 49.4%; Score 491; DB 25; Length 1474; 

Best Local Similarity 95.3%; Pred. No. 2.1e-109; 



Matches 506; Conservative 0; Mismatches 25; Indels 0; Gaps 0;


Qy      388 AAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAG 447
            | |||   || | |  ||  |     |   ||    ||||||||||||||||||||||||
Db      121 AGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAAAGCCACCCGGCCCAAGTTGAAGAAG 180

Qy      448 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 507
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      181 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 240

Qy      508 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 567
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      241 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 300

Qy      568 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 627
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      301 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 360

Qy      628 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 687
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      361 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 420

Qy      688 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 747
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      421 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 480

Qy      748 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 807
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      481 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 540

Qy      808 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 867
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      541 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 600

Qy      868 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAGTGTC 918
            ||||||||||||||||||||||||||||||||||||||||||||||  | |
Db      601 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAAAGCC 651





RESULT 12 
ABS56045 

ID ABS56045 standard; cDNA; 2266 BP. 
XX 

AC ABS56045; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human third splice variant of Don-1.
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 



KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation;

KW cell survival; epithelial cell; wound healing; tumour formation;

KW brain; vulnerary; cytostatic; gene therapy; gene; ss.

XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers

FT CDS 68..2010

FT /*tag= a

FT /product= "Third splice variant of Don-1"

FT /transl_except= (pos: 107..108, aa:Lys)

FT /note= "This codon has an apparent 1 nucleotide

FT deletion which alters the reading frame"

FT /transl_except= (pos: 994..996, aa:Thr)

XX

PN US2002127594-A1.
XX

PD 12-SEP-2002.
XX

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX

PA (GEAR/) GEARING D P.

PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71644. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 

PS Claim 4; Fig 7; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins.

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18. 

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes human third 

CC splice variant of Don-1. 

XX 



SQ Sequence 2266 BP; 502 A; 733 C; 700 G; 331 T; 0 other; 



Query Match 49.4%; Score 491; DB 25; Length 2266; 

Best Local Similarity 95.3%; Pred. No. 2.3e-109; 

Matches 506; Conservative 0; Mismatches 25; Indels 0; Gaps 0; 

Qy 38 8 AAGAAAGAGGT GGGCAAGAT CCT GT GC ACT GACT GC GC C AC C C GGCC CAAGTT GAAGAAG 447 

I I I I I 1 I I III I II I I I I I I I I hi I I I I I I I I I I I I I I 

Db 121 AGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAAAGCCACCCGGCCCAAGTTGAAGAAG 18 0 

Qy 44 8 ATGAAGAGCCAGACGGGACAGGT GGGTGAGAAGCAAT CGCT GAAGTGTGAGGCAGCAGCC 507 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 ATGAAGAGC CAGACGGGACAGGT GGGT GAGAAGCAAT CGCT GAAGT GT GAGGCAGCAGC C 24 0 

Qy 508 GGTAAT C CC CAGC CT T C CT ACC GT T GGT T CAAGGAT GGCAAGGAGCT CAAC CGCAGCCGA 567 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 241 GGTAAT CCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 300 

Qy 568 GAC AT T C GCAT CAAAT AT GGCAAC GGCAGAAAGAACT C ACGACT AC AGT T CAACAAGGT G 627 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 GACATTC GCAT CAAAT AT GGCAAC GGCAGAAAGAACT CACGACT ACAGTT CAACAAGGT G 360 

Qy 62 8 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 687 

I II I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 AAGGT GGAGGAC GCT GGGGAGT AT GT CT GCGAGGC C GAGAACAT C CT GGGGAAGGAC AC C 420 

Qy 68 8 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 747 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 480 

Qy 748 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 807 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 G CC C GGAAGT GCAAC GAGACAGC CAAGT CCT ATT GC GT CAAT GGAGGC GT CTGCT ACT AC 540 

Qy 8 08 ATCGAGGGCAT CAACCAGCT CT CCT GCAAATGTCCAAAT GGATT CTT CGGACAGAGAT GT 8 67 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I 
Db 541 AT C GAGGGCAT CAACCAGCT CT C CT GCAAAT GT CCAAAT GGATT CTT C GGACAGAGAT GT 600 

Qy 868 TTGGAGAAACT GC CT T T G C GATT GT AC AT GC C AGAT CCTAAGCAAAGT GT C 918 

I I I I I I I I I I I I I II I I 1 I I M I I I I I I I I M I I I I II I I I I I I II I I 
Db 601 TTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCAAAAAGCC 651 



RESULT 13 
AAV17815 

ID AAV17815 standard; cDNA; 1476 BP. 
XX 

AC AAV17815; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; 

KW epithelial cell; proliferation; stimulation; treatment; tumours; 

KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate; 

KW gastrointestinal tract; uterus; wound healing; transmembrane; ss. 



XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT CDS 69..1475 

FT /*tag= a 

FT /note= "don-1 polypeptide" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US14585. 
XX 

PR 19-NOV-1996; 96US-0753007. 

PR 19-AUG-1996; 96US-0699591. 
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Busfield SJ, Gearing DP; 
XX 

DR WPI; 1998-169084/15. 

DR P-PSDB; AAW48382. 
XX 

PT Mouse and human don-1 polypeptide(s) - useful for treatment of 

PT melanomas and adenocarcinoma(s), and for wound healing 

XX 

PS Claim 4; Fig 4; 121pp; English. 
XX 

CC The sequence is that of a human don-1 gene splice variant. 

CC Don-1 polypeptides stimulate proliferation of epithelial cells 

CC and thus are implicated in melanomas and adenocarcinomas in which 

CC epithelial cells proliferate out of control. Compounds that 

CC interfere with don-1 mediated cell proliferation can be used 

CC in the treatment of tumours such as melanomas and adenocarcinomas 

CC of the skin, oesophagus, lung, breast, liver, pancreas, 

CC gastrointestinal tract, colon, prostate or uterus. Alternatively, 

CC don-1 polypeptides can be used to stimulate epithelial cell 

CC proliferation, e.g. for wound healing. 

XX 

SQ Sequence 1476 BP; 335 A; 475 C; 450 G; 216 T; 0 other; 



Query Match 49.3%; Score 490.4; DB 19; Length 1476; 

Best Local Similarity 92.6%; Pred. No. 2.9e-109; 

Matches 515; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 



Qy 363 CCTCGATACCAACGGCAAAAATCT CAAGAAAGAGGTGGGCAAGAT CCT GTGCACT GACTG 422 

I I I I I II I II I I I I I I I I II I I II 

Db 98 C C GC GGCAAGAAGCAC C C AGAGGGGAGGAAGC GGGAGAGGGAG CC C GAT C C C GGGGAGAA 157 



Qy 423 C GC C AC C C GGC C CAAGT T GAAGAAGAT GAAGAGCCAGACGGGACAGGT GGGT GAGAAGCA 482 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 158 AGCCACCCGGCC CAAGTT GAAGAAGATGAAGAGCCAGACGGGACAGGT GGGTGAGAAGCA 217 

Qy 483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I 

Db 218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277 



Qy 543 602 

Db 278 337 



Qy 603 CT C AC GACT ACAGT T CAACAAGGT GAAGGT G GAGGAC GCT GGGGAGT AT GTC T GC GAGGC 662 

I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I E I I I I I I I I I I I I I I M I I 
Db 338 CT C AC GACT ACAGT T CAACAAG GT GAAG GT GGAGGACGCT GGGGAGT AT GT CT GC GAGGC 397 

Qy 663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722 

I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I II I I II I I I I II I I I I I I I I I I I 

Db 398 C GAGAACAT CCT GGGGAAGGACACC GT C C GGGGC CGGCT TT ACGT CAAC AGC GT GAGCAC 457 

Qy 723 CAC C CT GT CAT CCT GGT CGGGGCAC GC CC GGAAGTGCAAC GAGACAGC CAAGT C CT AT TG 782 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 458 CACC CT GT CAT C CT GGT CGGGGCAC GC C C GGAAGTGCAAC GAGACAGC CAAGT CCT AT TG 517 

Qy 7 83 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 518 C GT CAAT GGAGGC GT CT GCT ACT ACAT C GAGGGC AT CAACC AGCT CT C CT GCAAAT GT C C 577 

Qy 843 AAAT GGAT T CTT C GGACAGAGAT GT TTG GAGAAACT GC CT TT GCGAT T GT AC AT GCCAGA 902 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I r 
Db 57 8 AAAT GGATT CTT CGCACAGAGAT GTTTGGAGAAACTGCCTTT GCGATTGTACAT GCCAGA 637 

Qy 903 TCCTAAGCAAAGTGTC 918 

I I I I I 1 I I I I I I I 

Db 638 T CCT AAGC AAAAAG C C 653 



RESULT 14 
AAV17812 

ID AAV17812 standard; cDNA; 2467 BP. 
XX 

AC AAV17812; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Mus musculus don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; 

KW epithelial cell; proliferation; stimulation; treatment; tumours; 

KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate; 

KW gastrointestinal tract; uterus; wound healing; transmembrane; ss. 

XX 

OS Mus musculus. 
XX 

FH Key Location/Qualifiers 
FT CDS 79..1896 

FT /*tag= a 

FT /note= "transmembrane don-1 polypeptide" 

XX 

PN WO9807736-A1. 
XX 

PD 26-FEB-1998. 
XX 

PF 18-AUG-1997; 97WO-US14585. 
XX 

PR 19-NOV-1996; 96US-0753007. 

PR 19-AUG-1996; 96US-0699591. 
XX 

PA (MILL- ) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Busfield SJ, Gearing DP; 
XX 

DR WPI; 1998-169084/15. 

DR P-PSDB; AAW48379. 
XX 

PT Mouse and human don-1 polypeptide ( s ) - useful for treatment of 

PT melanomas and adenocarcinoma ( s ) , and for wound healing 

XX 

PS Claim 4; Fig 1; 121pp; English. 
XX 

CC The sequence is that of a murine don-1 gene splice variant. 

CC Don-1 polypeptides stimulate proliferation of epithelial cells 

CC and thus are implicated in melanomas and adenocarcinomas in which 

CC epithelial cells proliferate out of control. Compounds that 

CC interfere with don-1 mediated cell proliferation can be used 

CC in the treatment of tumours such as melanomas and adenocarcinomas 

CC of the skin, oesophagus, lung, breast, liver, pancreas, 

CC gastrointestinal tract, colon, prostate or uterus. Alternatively, 

CC don-1 polypeptides can be used to stimulate epithelial cell 

CC proliferation, e.g. for wound healing. 

XX 

SQ Sequence 2467 BP; 592 A; 752 C; 706 G; 417 T; 0 other; 

Query Match 46.7%; Score 464.6; DB 19; Length 2467; 

Best Local Similarity 91.0%; Pred. No. 6.2e-103; 

Matches 494; Conservative 0; Mismatches 49; Indels 0; Gaps 0; 

Qy 371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2 CT AAC GGCAAAAACAT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GACT GC GC CACC C 61 

Qy 431 GGC CCAAGT T GAAGAAGAT GAAGAGC CAGAC GGGAC AGGT GGGT GAGAAGCAAT C GCT GA 490 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I Mill I 
Db 62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121 

Qy 4 91 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550 

I I I I I I I I I I I II II II II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181 

Qy 551 AGCT CAACCGCAGC C GAGAC AT T C GC AT CAAAT AT GGCAAC GGC AGAAAGAACT CAC GAC 610 

I I I I I I I I I II I I II I I I I I I I I I I I 111 I I I I I I I I I I I I I I I I I 

Db 182 AACT CAAC C GGAGT C GT GAT ATT C GC AT CAAGT AT GGCAAT GT C AGAAAGAACT CAC GGC 241 

Qy 611 TACAGTTCAACAAGGT GAAGGT GGAGGACGCT GGGGAGT ATGT CT GCGAGGCCGAGAACA 670 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 242 T ACAGTT CAACAAAGT GAGC GT GGAGGAT GC C G GGGAGT AC GT CT GT GAGGC C GAGAACA 301 

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 302 T C CTT GGGAAGGAC AC C GT GAGGGGC C GACT C CATGT CAACAGC GT GAC CAC CACT CT GT 361 



Qy 731 CAT C CT GGT C GGGGCAC GC C C GGAAGT GCAAC GAGAC AGC CAAGT C CT AT T GC GT CAAT G 790 

I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I 
Db 362 CAT C CT GGT CGGGACAT GC C C GGAAGTGCAAT GAGAC CGC CAAGT C CT ACT GT GT GAAT G 421 

Qy 791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850 

I I I I I E I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 422 GAG GC GT GT GCT ACTACAT CGAGGGC AT CAAC C AGCT CT CCT G CAAAT GT C CAAAC GGAT 481 

Qy 851 T CTT C GGACAGAGAT GTT T GGAGAAACT GC CTT T GC GAT T GTACAT GC C AGAT C CT AAGC 910 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 4 82 T CTT CGGACAGAGAT GTTT GGAGAAACT GC CTT T GC GAT T GTACAT GC C AGAT C CT AAGC 541 

Qy 911 AAA 913 

I I I 

Db 542 AAA 544 



RESULT 15 
ABS56033 

ID ABS56033 standard; cDNA; 2442 BP. 
XX 

AC ABS56033; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding murine membrane-bound splice variant of Don-1. 
XX 

KW Murine; Don-1; epidermal growth factor; EGF; neuregulin; mouse; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; chromosome 18; gene; ss. 

XX 

OS Mus sp. 
XX 

FH Key Location/Qualifiers 

FT CDS 78.. 1895 

FT /*tag= a 

FT /product= "Membrane-bound splice variant of Don-1" 
XX 

PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-0096241. 
XX 

PR 22-JUN-2000; 2000US-0599789. 
XX 

PA (GEAR/) GEARING DP. 

PA (BUSF/) BUSFIELD SJ. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71636. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 



PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 

XX 

PS Claim 4; Fig 1; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins. 

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human 

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18. 

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes murine membrane-bound 

CC splice variant of Don-1. 

XX 

SQ Sequence 2442 BP; 587 A; 742 C; 703 G; 410 T; 0 other; 

Query Match 45.9%; Score 455.8; DB 25; Length 2442; 

Best Local Similarity 91.2%; Pred. No. 8.4e-101; 

Matches 495; Conservative 0; Mismatches 47; Indels 1; Gaps 1; 

Qy 371 C CAACGGCAAAAAT CT CAAGAAAGAG GT GGGCAAGAT C CT GT GCACT GACT GCGC C AC C C 430 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II 
Db 2 CTAACGGCAAAAACAT CAAGAAAGAGGT GGGCAAGAT C CT GT GCACT GACT GCGC CA- C C 60 

Qy 431 GGCCCAAGT T GAAGAAGAT GAAGAGC CAGAC GGGACAGGT GGGT GAGAAGCAAT C GCT GA 490 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I El! I I I I I I I I I I I I I I I I MEM I 
Db 61 GGCCCAAGCT GAAGAAGAT GAAGAGC CAGACAGGAGAGGT GGGT GAGAAGC AGT C GCT CA 12 0 

Qy 491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550 

I I I I I I I I I I I I I II II II II I I I I I I I I I I I II I II I I I I I I I II II M I I I 
Db 121 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 180 

Qy 551 AGCT CAAC C GCAGC C GAGAC ATT C GCAT CAAAT AT GGCAAC GGCAGAAAGAACT C AC GAC 610 

I I I I I I II I II II II I I I I I I I I I I I I I I I I II I I I I II I I I I I II I I I I I 

Db 181 AACT CAAC CGGAGT C GT GAT ATT C GCAT CAAGT AT GGCAAT GT CAGAAAGAACT C ACGGC 24 0 

Qy 611 T ACAGT T CAACAAGGT GAAG GT G GAGGACGCT G GGGAGT AT GT CT GC GAG GC CGAGAAC A 670 

I I I I I I I I I I I I I I I I I MINIMI II I I I I II I I I II I I I II I I II I I I I I I 
Db 241 T AC AGT T CAACAAAGT GAGG GT GGAGGATGCC GGGGAGT AC GT CT GT GAGGC CGAGAAC A 300 

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730 

MM II I II I I I I I I I I I I I I I II I II I II I I I I II II I I I II II I I I II I I 
Db 301 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 360 

Qy 731 CAT C CT GGT C GGGGCAC GC C C GGAAGT G CAAC GAGACAGC CAAGT C CT AT T GCGT CAAT G 7 90 

I I II I I I II I I I I II I I I I II I I I I I I I I I I II I I I I I II I I II I II I I I I I I 
Db 361 CAT C CT G GT CGGGACAT GC C C GGAAGT GCAAT GAGACC GC CAAGT C CT ACT GT GT GAAT G 420 



Qy 791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I > I 

Db 421 GAGGCGT GT GCT ACT ACAT C GAG GGCAT CAAC CAGCT CT C CT GCAAAT GT CCAAAC G GAT 48 0 

Qy 851 T CT T CGGAC AGAGAT GT TT GGAGAAACT G C CT T T GCG AT T GT AC AT GC C AGAT C CT AAG C 910 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I 
Db 481 TCT T CGGACAGAGAT GT TT GGAGAAACT GC CT T T GCGAT T GT AC AT GC C AGAT C CT AAGC 54 0 

Qy 911 AAA 913 

I I I 

Db 541 AAA 543 



Search completed: January 14, 2004, 08:16:37 
Job time : 324.696 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: January 14, 2004, 07:16:01 ; Search time 73.5907 Seconds 

(without alignments) 
5961.825 Million cell updates/sec 



Title:          US-09-864-675-1 

Perfect score:  994 

Sequence:       1 atgaggcgcgacccggcccc..........caccttggatttgaattaaa 994 

Scoring table:  IDENTITY_NUC 

                Gapop 10.0 , Gapext 1.0 



Searched: 569978 seqs, 220691566 residues 

Total number of hits satisfying chosen parameters: 1139956 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post-processing: Minimum Match 0% 

                 Maximum Match 100% 
                 Listing first 45 summaries 

Database :       Issued_Patents_NA:* 

                 1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:* 

                 2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:* 

                 3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:* 

                 4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:* 

                 5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:* 

                 6: /cgn2_6/ptodata/2/ina/backfiles1.seq:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
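
The percentage figures printed in this report can be cross-checked against its raw counts. The sketch below assumes, as an inference from the numbers shown here rather than from any GenCore documentation, that "Query Match" is Score divided by Perfect score, and that "Best Local Similarity" is Matches divided by the aligned length (Matches + Mismatches + Indels). The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score.

    Assumption: this mirrors the report's "Query Match" column.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_percent(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns.

    Assumption: indels count toward the aligned length, which is what
    the figures in this report are consistent with.
    """
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    # Values taken from the report: perfect score 994, top hit score 900,
    # with 914 matches, 5 mismatches and 1 indel.
    print(query_match_percent(900, 994))
    print(best_local_similarity_percent(914, 5, 1))
```

A value that disagrees with the printed percentage by more than rounding would suggest an extraction error in the score or count fields rather than in the search itself.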

SUMMARIES 

                  % 
Result          Query 
   No.   Score  Match  Length  DB  ID                     Description 

     1     900   90.5    1884   3  US-08-753-007A-5       Sequence 5, Appli 
     2     900   90.5    1884   3  US-09-398-496-5        Sequence 5, Appli 
     3     881   88.6     993   2  US-08-525-864A-3       Sequence 3, Appli 
     4   738.6   74.3    3441   2  US-08-525-864A-1       Sequence 1, Appli 
     5   547.2   55.1    1607   3  US-08-753-007A-3       Sequence 3, Appli 
     6   547.2   55.1    1607   3  US-09-398-496-3        Sequence 3, Appli 
     7     492   49.5    1476   3  US-08-753-007A-7       Sequence 7, Appli 
     8     492   49.5    1476   3  US-09-398-496-7        Sequence 7, Appli 
     9     492   49.5    2268   3  US-08-753-007A-31      Sequence 31, Appl 
    10     492   49.5    2268   3  US-09-398-496-31       Sequence 31, Appl 
    11   467.8   47.1    2467   3  US-08-753-007A-1       Sequence 1, Appli 
    12   467.8   47.1    2467   3  US-09-398-496-1        Sequence 1, Appli 
    13   359.6   36.2    1207   2  US-08-525-864A-5       Sequence 5, Appli 
    14   135.6   13.6     142   2  US-08-525-864A-18      Sequence 18, Appl 
    15    95.4    9.6    1140   1  US-08-036-555B-149     Sequence 149, App 
    16    95.4    9.6    1140   1  US-08-469-569-149      Sequence 149, App 
    17    95.4    9.6    1140   1  US-08-249-322A-149     Sequence 149, App 
    18    95.4    9.6    1140   1  US-08-469-526A-149     Sequence 149, App 
    19    95.4    9.6    1140   2  US-08-734-591A-149     Sequence 149, App 
    20    95.4    9.6    1140   2  US-08-469-660-149      Sequence 149, App 
    21    95.4    9.6    1140   3  US-08-341-018-55       Sequence 55, Appl 
    22    95.4    9.6    1140   3  US-08-470-335-149      Sequence 149, App 
    23    95.4    9.6    1140   3  US-08-735-021-149      Sequence 149, App 
    24    95.4    9.6    1140   3  US-08-734-664A-149     Sequence 149, App 
    25    95.4    9.6    1140   3  US-08-470-339-149      Sequence 149, App 
    26    95.4    9.6    1140   4  US-08-467-602-149      Sequence 149, App 
    27    95.4    9.6    1140   5  PCT-US94-05083C-145    Sequence 145, App 
    28    95.4    9.6    1140   5  PCT-US95-06846A-149    Sequence 149, App 
    29    93.4    9.4    1193   1  US-08-469-526A-134     Sequence 134, App 
    30    93.4    9.4    1193   2  US-08-734-591A-134     Sequence 134, App 
    31    93.4    9.4    1193   3  US-08-341-018-3        Sequence 3, Appli 
    32    93.4    9.4    1193   3  US-08-470-335-134      Sequence 134, App 
    33    93.4    9.4    1193   3  US-08-735-021-134      Sequence 134, App 
    34    93.4    9.4    1193   3  US-08-734-664A-134     Sequence 134, App 
    35    93.4    9.4    1193   3  US-08-470-339-134      Sequence 134, App 
    36    93.4    9.4    1193   4  US-08-467-602-134      Sequence 134, App 
    37    91.8    9.2    1193   1  US-08-036-555B-134     Sequence 134, App 
    38    91.8    9.2    1193   1  US-08-469-569-134      Sequence 134, App 
    39    91.8    9.2    1193   1  US-08-249-322A-134     Sequence 134, App 
    40    91.8    9.2    1193   2  US-08-469-660-134      Sequence 134, App 
    41    91.8    9.2    1193   5  PCT-US94-05083C-130    Sequence 130, App 
    42    91.8    9.2    1193   5  PCT-US95-06846A-134    Sequence 134, App 
    43    77.4    7.8    2003   1  US-08-036-555B-21      Sequence 21, Appl 
    44    77.4    7.8    2003   1  US-08-469-569-21       Sequence 21, Appl 
    45    77.4    7.8    2003   1  US-08-249-322A-21      Sequence 21, Appl 



ALIGNMENTS 



RESULT 1 
US-08-753-007A-5 
; Sequence 5, Application US/08753007A 
; Patent No. 6074841 
; GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J. 
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 
; NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 
; STREET: 225 Franklin Street 
; CITY: Boston 
; STATE: MA 
; COUNTRY: US 
; ZIP: 02110-2804 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/753,007A 
; FILING DATE: 19-NOV-1996 
; CLASSIFICATION: 536 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 
; REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617-542-5070 
; TELEFAX: 617-542-8906 
; TELEX: 
; INFORMATION FOR SEQ ID NO: 5: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1884 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 
; FEATURE: 
; NAME/KEY: Coding Sequence 
; LOCATION: 664...1883 
; OTHER INFORMATION: 

US-08-753-007A-5 



Query Match 90.5%; Score 900; DB 3; Length 1884; 

Best Local Similarity 99.3%; Pred. No. 3.5e-223; 

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 



Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277 



Qy 61 T ACT C GCC CAGC CT CAAGT C AGT GCAGGAC C AGGCGT ACAAGGC ACCC GT GGT GGTGGAG 120 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180 

I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I II I I I I I I I I I II I I I I I I I I I I I I 

Db 338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517 



Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

Db 518 577 



Qy 361 C C C CT C GAT AC CAAC GG CAAAAAT CT CAAGAAAGAGGT G GGCAAGAT CCT GT GCACT GAC 42 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 57 8 CCCCT - GAT ACCAACGGCAAAAAT CT CAAGAAAGAGGT GGGCAAGATCCT GT GCACT GGC 63 6 

Qy 421 T GC GC CAC C C G GCC CAAGT T GAAGAAGAT GAAGAGC C AGAC GGGAC AGGT GGGT GAGAAG 48 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 637 TGC GC CAC C C GGC C CAAGTT GAAGAAGAT GAAGAGC C AGAC GGGACAGGT GGGT GAGAAG 696 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 54 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756 

Qy 541 GAT GGCAAGGAGCT CAAC C GC AGC C GAGACAT T C GC AT CAAAT AT GGCAAC GGCAGAAAG 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 757 GAT GGCAAGGAGCT CAAC C GCAGC C GAGACAT T C GCAT CAAAT AT GGCAAC GGCAGAAAG 816 

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTAT GT CT GCGAG 660 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I 
Db 817 AACT CAC GACTACAGTT CAACAAGGT GAAGGT GGAGGAC GCTGGGGAGTAT GT CT GCGAG 876 

Qy 661 GCC GAGAACAT CCT GGGGAAGGACAC C GT C C GGGGC CGGCT TT ACGT CAACAGC GT GAGC 720 

I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936 

Qy 721 AC CAC CCT GT CAT C CT GGT C GGGGCAC GCCC GGAAGTGCAACGAGACAGC CAAGT C CT AT 78 0 

I I I I I I I I I I I I I II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 937 AC CAC CCT GT CAT C CT GGT C GGGGC AC GCC C GGAAGTGCAACGAGACAGC CAAGT C CT AT 996 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 84 0 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056 

Qy 841 CCAAATGGATT CTT CGGAC AGAGAT GT T T GGAGAAACT GCCTTT GC GATT GT ACAT GC C A 900 

I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1057 CCAAATGGATTCTT C GGAC AGAGAT GT T T GGAGAAACT GCCTTT GC GATT GT ACAT GC C A 1116 

Qy 901 GATCCTAAGCAAAGTGTCCT 920 

I I I I I I I I I II II III 
Db 1117 GATCCTAAGCAAAAGCACCT 1136 



RESULT 2 
US-09-398-496-5 
; Sequence 5, Application US/09398496 
; Patent No. 6133423 
; GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 
; APPLICANT: Busfield, Samantha J. 
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 
; NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 
; STREET: 225 Franklin Street 
; CITY: Boston 
; STATE: MA 
; COUNTRY: US 
; ZIP: 02110-2804 
; COMPUTER READABLE FORM: 
; MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/398,496 
; FILING DATE: 
; CLASSIFICATION: 
; PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/753,007 
; FILING DATE: 19-NOV-1996 
; APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 
; REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617-542-5070 
; TELEFAX: 617-542-8906 
; TELEX: 
; INFORMATION FOR SEQ ID NO: 5: 
; SEQUENCE CHARACTERISTICS: 
; LENGTH: 1884 base pairs 
; TYPE: nucleic acid 
; STRANDEDNESS: single 
; TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 
; FEATURE: 
; NAME/KEY: Coding Sequence 
; LOCATION: 664...1883 
; OTHER INFORMATION: 
US-09-398-496-5 

Query Match 90.5%; Score 900; DB 3; Length 1884; 

Best Local Similarity 99.3%; Pred. No. 3.5e-223; 

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277 

Qy 61 TACT C GC CC AGC CT CAAGT C AGT GC AG GAC CAGGC GT ACAAGGCAC C C GT GGT GGT GGAG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 27 8 TACT CGCCCAGC CT CAAGT CAGT GC AGGAC CAGGC GT ACAAGGCAC CC GT GGT GGT GGAG 337 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397 



Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457 



Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577 

Qy 361 C CC CT C GAT AC CAACGGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GCACT GAC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 57 8 C CC CT - GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GCACT GGC 636 

Qy 421 T GC GC CAC C C GGC C CAAGT T GAAGAAGAT GAAGAGCCAGAC GGGACAGGT GGGT GAGAAG 4 80 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 637 T GC GCC AC C C GGC C CAAGT T GAAGAAGAT GAAGAGC CAGAC GGGACAGGT GGGT GAGAAG 696 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 54 0 

I I I I II I I I I I I I II I II I M I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756 

Qy 541 GAT GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCAT CAAATATGGCAACGGCAGAAAG 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 757 GAT GGCAAGGAGCT CAAC CGCAG C C GAGAC ATT C GCAT CAAAT AT GGCAAC GGCAGAAAG 816 

Qy 601 AACT CACGACTACAGTT CAACAAGGT GAAGGTGGAGGACGCTGGGGAGTAT GTCT GCGAG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 817 AACT CACGACTACAGTT CAACAAGGTGAAGGTGGAGGACGCTGGGGAGTAT GT CT GCGAG 87 6 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 72 0 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936 

Qy 721 ACCAC C CT GT CAT C CT GGT C GGGGC ACGC C C GGAAGT GCAACGAGACAGC CAAGT C CT AT 78 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 937 ACCAC C CT GT CAT C CT GGT C GGGGCACGC C CGGAAGTGCAACGAGACAG C CAAGT CCT AT 996 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 84 0 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056 

Qy 841 C CAAAT GGATT CTT CGGAC AGAGAT GTT T GGAGAAACT GC CTT T GC GAT T GT ACAT GC CA 900 

I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 1057 C CAAAT GGATT CTT CGGAC AGAGAT GTT T GGAGAAACT GCCTT T GC GAT T GT ACAT GCCA 1116 

Qy 901 GAT C CT AAGCAAAGT GT C CT 92 0 

I I I I I I I I I I I I I III 
Db 1117 GATCCTAAGCAAAAGCACCT 1136 



RESULT 3 

US-08-525-864A-3 

; Sequence 3, Application US/08525864A 
; Patent No. 5912326 
; GENERAL INFORMATION: 

APPLICANT: Chang, Han 



TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto 
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: LAHIVE & COCKFIELD 
STREET: 28 State Street 
CITY: Boston 
; STATE: Massachusetts 

; COUNTRY: USA 

ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: AscII (text) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/525,864A
FILING DATE: 8-SEP-1995 
; CLASSIFICATION: 530 

; ATTORNEY/AGENT INFORMATION: 
; NAME: Kara, Catherine J. 

; REGISTRATION NUMBER: 41,106 

REFERENCE/DOCKET NUMBER: HUI-017
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (617)227-7400
TELEFAX: (617)742-4214 
; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 993 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 
LOCATION: 1..990
US-08-525-864A-3 

Query Match 88.6%; Score 881; DB 2; Length 993; 

Best Local Similarity 93.0%; Pred. No. 2.3e-218; 

Matches 923; Conservative 0; Mismatches 70; Indels 0; Gaps 0; 

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I I I I I I.I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60 

Qy 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180 

I I I I I I I I I II I I I I I I I III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300



Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360

Qy 361 C CC CT C GAT AC CAAC G GCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GAC 420 

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 C C GGT C GAC C CT AAC GG CAAAAACAT CAAGAAAGAGGT GGGCAAGAT C C T GT GC ACT GAC 420 

Qy 421 T GC G C CACC C GG CC CAAGT T GAAGAAGAT GAAGAGCC AGAC GGGACAGGT GG GT GAGAAG 4 80 

I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I 

Db 421 T GC GCAACC C GGC C CAAGCT GAAGAAGAT GAAGAGT C AGACAGGAGAGGT GGGCGAGAAG 4 80 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540 

II I I I I I I I I I I I I I I I I II II II II I I I I I I I I II I I I II I I I I I I I I I 

Db 481 CAGTCGCT CAAGT GTGAGGC GGC GGC GGGGAACCCCCAGCCCTCCTATC GAT GGT TCAAG 540 

Qy 541 GAT GGCAAGGAGCT CAACC G CAGC G GAGACAT T C GC AT CAAAT ATGGCAAC GGC AGAAAG 600 

II I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 GAC GGCAAGGAG CT CAAC C GGAGT C GT GAC AT T C GC AT CAAGT AT GGCAAC GGC AGAAAG 600 

Qy 601 AACT C AC GACT ACAGT T CAACAAGGTGAAGGT GGAGGAC GCT GGGGAGT AT GT CT GC GAG 660 

I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I III 

Db 601 AACT CAC GG CT AC AGT T CAACAAAGT GAAGGT GGAGGAC GCT GGAGAGT AC GT CT GT GAG 660 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720 

II I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 720 

Qy 721 AC CAC C CTGT CAT C CT GGT C GGG GC AC GC C CGGAAGT GCAAC GAGACAGC CAAGT CCT AT 7 80 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 780

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840 

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1.1 I I I I I I I I I I I 

Db 781 T GTGTGAAT GGAGGCGTGT GCTACTACAT CGAAGGCATCAACCAACT CTCCT GCAAATGT 840 

Qy 841 C CAAAT GGAT T CT T C GGACAGAGAT GTT T GGAGAAACT GCCTT T GC GATT GT AC AT G CCA 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 841 C CAAAC GGAT T CT T C GGACAGAGAT GTT T GGAGAAACT GCCTT T GC GATT GT AC AT GCCA 900 

Qy 901 GAT C CT AAGCAAAGT GTC CT GT GGGAT ACACC GG GGACAGGT GT CAGC AGT T C GCAAT GG 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 GAT C CT AAG CAAAGT GTC CT GT GG GAT ACACC GG GGACAGGT GT CAGC AGT T C GCAAT GG 960 

Qy 961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAA 993 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 T CAACT T CT C CAAGC ACCT T GGAT TT GAATTAA 993 



RESULT 4 

US-08-525-864A-1 

; Sequence 1, Application US/08525864A 



; Patent No. 5912326 
; GENERAL INFORMATION: 

APPLICANT: Chang, Han 
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses 
TITLE OF INVENTION: Related thereto 
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS:

ADDRESSEE: LAHIVE & COCKFIELD 
STREET: 28 State Street 
CITY: Boston 
; STATE: Massachusetts 

COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: AscII (text) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995 

CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 
; NAME: Kara, Catherine J. 

REGISTRATION NUMBER: 41,106 
REFERENCE/DOCKET NUMBER: HUI-017 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (617)227-7400 
TELEFAX: (617)742-4214 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 3441 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: double 
; TOPOLOGY: linear 

; MOLECULE TYPE: cDNA 

FEATURE: 

NAME/KEY: CDS
LOCATION: 180..2441
US-08-525-864A-1 

Query Match 74.3%; Score 738.6; DB 2; Length 3441; 

Best Local Similarity 90.4%; Pred. No. 2.2e-181; 

Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0;

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I U I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 240 T ACT C GC C CAGC CT CAAGT C C GT GCAGGAC CAG G CGT ACAAGGC AC CC GT GGT GGT GGAG 299 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

I I I I I I I I I I I I I I I I I I III II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359 



Qy 

Db 



181 
360 



240 
419 



Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539 

Qy 361 C C C CT C GAT AC CAACGGCAAAAAT CT CAAGAAAGAGGT GG GCAAGAT C CT GT GCACT GAC 42 0 

II I I I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

Db 540 C C GGT C GAC CCT AACGGCAAAAACAT CAAGAAAGAGGT GGGCAAGAT C CT GT GCACT GAC 599 

Qy 421 T GC GC CAC C C GGC CCAAGTT GAAGAAGAT GAAGAGCCAGAC GGGACAGGT GGGT GAGAAG 4 80 

I I I I I MINIMUM I I I I I I I I I I I I I I II I I I I I III I I I I I I I I I I I I I 

Db 600 T GC GCAAC C C GGCC CAAGCT GAAGAAGAT GAAGAGT CAGACAGGAGAGGT GGGC GAGAAG 659 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

II I I I I I I I I I I I I I I II II II II II I I I I I I I I I I I I I II I I I I I I I I I 

Db 660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719 

Qy 541 GAT GGCAAGGAGCT CAAC C GC AGC C GAGACATT C GCAT CAAAT AT GGCAAC GGCAGAAAG 600 

II I I I I I I I I I I I I I I I I I II II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 720 GAC GGCAAGGAGCT CAACCGGAGT CGT GACATTCGCATCAAGT ATGGCAACGGCAGAAAG 77 9 

Qy 601 AACT CAC GACT ACAGT T CAACAAGGT GAAGGTGGAGGACGCT GGGGAGT AT GT CT GC GAG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 
Db 7 80 AACT CAC GGCT AC AGT T CAACAAAGT GAAGGTGGAGGAC GCT GGAGAGT AC GT CT GT GAG 839 

Qy 661 GC C GAGAAC AT CCT GGGGAAGGAC ACC GT C C GGGG C C GGCTT T AC GT CAACAGCGT GAGC 72 0 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I J I I I 
Db 8 40 G CT GAGAACAT C CT T GGGAAGGAC ACT GT GAGGGGC C GGCT C CAT GT CAACAGTGT GAGC 899 

Qy 721 ACC ACC CT GT CAT CCT GGT C GGGGCAC GC CCGGAAGT GCAAC GAGAC AGC CAAGT CCTAT 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 900 AC C ACT CT GT C GT CCT GGT C GGGGCAC GC CC GGAAGT GCAAT GAGACAGC CAAGT C CTAC 959 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019 

Qy 841 C CAAAT GGATT CT T CGGACAGAGAT GT T T GGAG 873 

II I I I I I I II I I I I I I III 

Db 1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 1052 



RESULT 5 

US-08-753-007A-3 

; Sequence 3, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 



TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA 
; COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0 

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 

CLASSIFICATION: 536 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 
; TELEX: 

; INFORMATION FOR SEQ ID NO: 3: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 1607 base pairs 

; TYPE: nucleic acid 

; STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: Coding Sequence
LOCATION: 79...621
OTHER INFORMATION: 
US-08-753-007A-3 

Query Match 55.1%; Score 547.2; DB 3; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 4.7e-132; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 
Qy 371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430

Db 2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy 431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490

Db 62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121



Qy
Db 



491 
122 



550 
181 



Qy 

Db 



551 
182 



610 



241 



Qy 611 T AC AGT T CAACAAGGT GAAGGT GGAGGAC G CT GGGGAGT AT GT CT GCGAGGC C GAGAACA 670 

I I I I I I I I I I II I I I I I MINIMI || I I I M I I I INN I I I I I I I I I I I I I 
Db 242 T AC AGTT CAACAAAGT GAGGGT GGAGGAT GC C G G GGAGT AC GT CT GT GAGGC CGAGAAC A 301 

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730 

I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I 
Db 302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361 

Qy 731 CAT C CT GGT C GGGGCAC GC C C GGAAGT GCAAC GAGAC AGCCAAGT C CTATT GC GT CAAT G 790 

I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II || MM 
Db 362 CAT C CT GGT C GGGAC AT GC CC GGAAGT GCAAT GAGAC C GC CAAGT C CT ACT GT GT GAAT G 421 

Qy 791 GAGGCGT CT GCTACT ACAT CGAGGGCAT CAACCAGCTCTCCT GCAAATGTCCAAAT GGAT 850 

I I I I I I I I I II I I I M I I I I I M I I I I M II I II I M I M II I I I M I I M I I I I I I I 

Db 422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481 

Qy 851 T CT T CGGAC AGAGAT GTTT GGAGAAACT GC CT TT GC GATT GT AC AT GCC AGAT C CTAAGC 910 

I M I I I 11 M I I I I I II I M I II II I I II I II II I I I II I II I I I I I I I II I I II II II I 
Db 4 82 T CTT CGGAC AGAGAT GT TT GGAGAAACT GC CTTT GC GATT GT ACAT GC C AGAT C CTAAGC 541 

Qy 911 AAAGTGT C CT GT GGGAT ACACC GGGGACAGGT GT C AGC AGTT C GCAAT GGT CAACT T CT C 97 0 

I I I II II I II I I II I II I I I II II II I I II II II I I I II II II II II II II II I II I I II 
Db 542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601 

Qy 971 CAAGCAC CT TGGAT T T GAATTAAA 994 

I I I II I I I I II I I I I I I II I I II 
Db 602 CAAGC ACCT TGGATT T GAAT T GAA 625 



RESULT 6 
US-09-398-496-3 

Sequence 3, Application US/09398496 
Patent No. 6133423 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 



APPLICATION NUMBER: US/09/398,496

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 

FILING DATE: 19-NOV-1996 

APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 

NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001

TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 

TELEX: 

; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1607 base pairs 
TYPE: nucleic acid 
; STRANDEDNESS: single 

; TOPOLOGY: linear 

MOLECULE TYPE: cDNA 
FEATURE:

NAME/KEY: Coding Sequence 
LOCATION: 79...621
OTHER INFORMATION: 
US-09-398-496-3 

Query Match 55.1%; Score 547.2; DB 3; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 4.7e-132; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 
Qy 371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430

I I I I M I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I
Db 2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy 431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I I I I I I I
Db 62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy 491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550

I I I I I I I I I I I I I M M II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy 551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610

I I I I I I I I I II II II I I I I I I I I I I I I I I I I I I I I I I I M I I II I I I I I I I
Db 182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy 611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670

I I I I I I I I I I I I I I I II I I I I I I I II II MINIM I I I I I I I I I I I I I I I I I I
Db 242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730

INI I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I II I
Db 302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361



Qy 

Db 



731 
362 



790 
421 



Qy 791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850

Db 422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy 851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910

Db 482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy 911 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 970

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 542 AAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 601

Qy 971 CAAGCACCTTGGATTTGAATTAAA 994

I I I I I I I I I I I I I I I I I I I I I II
Db 602 CAAGCACCTTGGATTTGAATTGAA 625



RESULT 7 

US-08-753-007A-7 

; Sequence 7, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
; ZIP: 02110-2804 

; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 
CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

; REFERENCE/DOCKET NUMBER: 07334/022001 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 



TELEX:

; INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1476 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
; TOPOLOGY: linear 

MOLECULE TYPE: cDNA 
FEATURE: 

; NAME/KEY: Coding Sequence

LOCATION: 69...1475

OTHER INFORMATION: 
US-08-753-007A-7 

Query Match 49.5%; Score 492; DB 3; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 8.3e-118; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 363 C CT C GAT ACCAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GACT G 422 

I I I I I I I I I I I I I I I I I I I I I I || 

Db 98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157 

Qy 423 C GC C AC C C GGC C CAAGT T GAAGAAGAT GAAGAGC C AGACGGGACAGGT GGGT GAGAAGC A 4 82 

I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 158 AGCCAC CC GGC CCAAGT T GAAGAAGAT GAAGAGC CAGAC GGGACAGGTGGGT GAGAAGC A 217 

Qy 483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542 

I I M I I II I I I I I I I I I I I I I I I I I I | | | | | | | | | | M I I I I I I I I I I I I I I I I I II I I I 
Db 218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277 

Qy 543 T GGCAAGGAGCT CAAC C GC AGC CGAGACATT C GCAT CAAAT AT GGCAAC GGCAGAAAGAA 602 

I I M I I I I I I I I II I I I I II I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | | | | 
Db 278 T GGCAAGGAG CT CAAC C GC AGC C GAGACAT TC G CAT CAAAT AT GGCAACGGCAGAAAGAA 337 

Qy 603 CT CACGACT ACAGTTCAACAAGGT GAAGGT GGAGGACGCT GGGGAGTATGT CTGCGAGGC 662 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | || M I I I I I I I I I 
Db 338 CT CAC GACT ACAGTT CAACAAGGT GAAGGT GGAGGACGCT GGGGAGT AT GT CT GC GAGGC 397 

Qy 663 CGAGAACAT C CT GGGGAAGGACAC CGT C C GGGGC C GGCTT T AC GT CAACAGCGT GAGCAC 722 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 398 C GAGAACAT CCT GGGGAAGGACAC CGT C C GGGGC C GGCTT T AC GT CAAC AGC GT GAGCAC 457 

Qy 723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II II I I I I I I I I I I I I I I 
Db 458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517 

Qy 783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577 

Qy 843 AAAT GGATT CT T C GGACAGAGAT GTT T GGAGAAACTGC CTT TGC GAT T GTACAT GC CAGA 902 

I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 578 AAAT GGATT CTT C GGACAGAGAT GTTT GGAGAAACTGC CT T TGC GAT T GTACAT GC CAGA 637 

Qy 903 TCCTAAGCAAAGTGTC 918 

I I I I I I I I I I I II 
Db 638 TCCTAAGCAAAAAGCC 653 



RESULT 8 
US-09-398-496-7 

; Sequence 7, Application US/09398496
; Patent No. 6133423 
; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 

CITY: Boston 
; STATE: MA 

COUNTRY: US

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ Version 2.0

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 

FILING DATE: 19-NOV-1996 

APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 

REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 

TELEX: 

; INFORMATION FOR SEQ ID NO: 7:
SEQUENCE CHARACTERISTICS: 
LENGTH: 1476 base pairs 
; TYPE: nucleic acid 

; STRANDEDNESS: single

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

; NAME/KEY: Coding Sequence 

LOCATION: 69...1475

OTHER INFORMATION: 
US-09-398-496-7 



Query Match 49.5%; Score 492; DB 3; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 8.3e-118; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0;



Qy 363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422

M N I M III I Ml I I I I I I I I II 

Db 98 C C GCGGCAAGAAGCAC C C AGAGGGGAGGAAG C G GGAGAGGGAGCC C GAT C C C GGGGAGAA 157 

Qy 423 CGC CAC C C GGC C CAAGTT GAAGAAGAT GAAGAGCCAGAC GGGAC AGGT GGGT GAGAAGCA 4 82 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | || | | M I I I I I I I I I I I I I I I I I I I I 
Db 158 AGCCACCCGGCCCAAGTT GAAGAAGATGAAGAGCCAGACGGGACAGGT GGGT GAGAAGCA 217 

Qy 483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I M I 
Db 218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277 

Qy 543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I | | | | | | | | | | | | | I I I I I I I I II I I 

Db 278 T G GCAAGGAGCT CAAC C GCAGC C GAGAC AT T C GCAT CAAAT AT GGCAACGGCAGAAAGAA 337 

Qy 603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662 

M I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | || | | | | | | | | | | | | | | | | | | | | | | | | 
Db 338 CTC ACGACT ACAGTT CAACAAGGT GAAGGT GGAGGAC GCT GGGGAGT AT GT CT GC GAGGC 397 

Qy 663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722 

I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | 
Db 398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457 

Qy 723 CAC CCT GTCAT C CTGGT C GGGGCAC GCC C GGAAGT GCAAC GAGACAGC CAAGT C CT ATT G 782 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | 
Db 458 CAC C CT GT CAT C CT GGT C GGGGCAC GCC C GGAAGT GCAAC GAGAC AGCCAAGT CCT ATT G 517 

Qy 783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842

M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I MM I I I I I I I I I I I I I I I I I I 
Db 518 CGT CAAT GGAGGC GTCT GCT ACT AC AT CGAGGGCAT CAAC CAGCTCT CCT GCAAATGTCC 577 

Qy 843 AAAT G GATT CT T C GGAC AGAGAT GT TT G GAGAAACT GC CT T T G C GAT T GT AC AT GC C AGA 902 

M I I II I M I I I II I II II I M M II II M II II II II II II II M I I I M I II I II I II 

Db 578 AAAT GGAT T CTT C GGACAGAGAT GTTT GGAGAAACT GC CTTT GC GAT T GT ACAT GC CAGA 637 

Qy 903 TCCTAAGCAAAGTGTC 918 

I II II II II II I I 
Db 638 TCCTAAGCAAAAAGCC 653 



RESULT 9 

US-08-753-007A-31 

; Sequence 31, Application US/08753007A 

; Patent No. 6074841 

; GENERAL INFORMATION:

APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
; TITLE OF INVENTION: AND USES THEREFOR 

NUMBER OF SEQUENCES: 33
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
; STATE: MA 



COUNTRY: US 

ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A

FILING DATE: 19-NOV-1996 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 

ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983

REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 617-542-5070 

TELEFAX: 617-542-8906 

TELEX:

; INFORMATION FOR SEQ ID NO: 31: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2268 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 

FEATURE: 

NAME/KEY: Coding Sequence

LOCATION: 69...2009
; OTHER INFORMATION: 

US-08-753-007A-31 

Query Match 49.5%; Score 492; DB 3; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 9.7e-118; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 363 C CT C GAT ACCAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GCACT GACT G 422 

I I I I I I I III I I I I I I I I I I I | M 

Db 98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy 423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II | | | M I I I 
Db 158 AGC C AC C C GGC C CAAGT T GAAGAAGAT GAAGAGC CAGACGGGACAGGT GGGT GAGAAGCA 217 

Qy 483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542 

I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | I I II I I I I I I I I I I I I I I I I I I | II M | | | 
Db 218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277 

Qy 543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602

I I I I M I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I II I I I I I I 
Db 278 T GGCAAGGAGCT CAAC C GCAGC C GAGACATT C GCAT CAAAT AT GGCAAC GGC AGAAAGAA 337 

Qy 603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662 

I I I I I I I I I I I I I I I I M I I I I I II I I I I I I II II I I I I I I I I I I I I I I I II I I I I I | | | 



Db 338 CT CACGACT AC AGT T CAACAAGGT GAAG GT GGAGGAC GCT GGGGAGT AT GT CT GCGAGGC 397 

Qy 663 C GAGAAC AT CCT GGGGAAGGACAC C GTC C GGGGC C GGCT TT AC GT CAACAGC GT GAG C AC 722 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457 

Qy 723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I II I I I I I I M I I I I I I I I 
Db 458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy 783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577 

Qy 843 AAAT GGAT T CTT C GGAC AGAGAT GT TT GGAGAAACT GCCTTT GC GATT GT ACAT GC CAGA 902 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 578 AAATGGAT T CT T C G GACAGAGAT GT TT GGAGAAACT GCCTTT GC GAT T GT AC AT GCCAGA 637 

Qy 903 TCCTAAGCAAAGTGTC 918 

I I I I I II I I I I II 
Db 638 TCCTAAGCAAAAAGCC 653 



RESULT 10 
US-09-398-496-31 

; Sequence 31, Application US/09398496 

; Patent No. 6133423 

; GENERAL INFORMATION: 

; APPLICANT: Gearing, David P.

APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 



REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
; TELEX: 

; INFORMATION FOR SEQ ID NO: 31: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2268 base pairs 

TYPE: nucleic acid 

STRANDEDNESS : single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: Coding Sequence 

LOCATION: 69... 2009 

OTHER INFORMATION: 
US-09-398-496-31 



Query Match 49.5%; Score 492; DB 3; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 9.7e-118; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 



Qy   363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
         || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db    98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy   423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy   483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy   543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy   603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy   663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy   723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy   783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy   843 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 902
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637

Qy   903 TCCTAAGCAAAGTGTC 918
         |||||||||||  | |
Db   638 TCCTAAGCAAAAAGCC 653



RESULT 11 
US-08-753-007A-1 

; Sequence 1, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES, THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2467 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: circular
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 79...1893
; OTHER INFORMATION:
US-08-753-007A-1 



Query Match 47.1%; Score 467.8; DB 3; Length 2467;

Best Local Similarity 91.3%; Pred. No. 1.8e-111;

Matches 496; Conservative 0; Mismatches 47; Indels 0; Gaps 0;



Qy   371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
         | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db     2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy   431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
         |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db    62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy   491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
         ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db   122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy   551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
         | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db   182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy   611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
         ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db   242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy   671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
         |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db   302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy   731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
         ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db   362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy   791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
         ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db   422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy   851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy   911 AAA 913
         |||
Db   542 AAA 544



RESULT 12 
US-09-398-496-1 

; Sequence 1, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 2467 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: circular
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 79...1893
; OTHER INFORMATION:
US-09-398-496-1 

Query Match 47.1%; Score 467.8; DB 3; Length 2467; 

Best Local Similarity 91.3%; Pred. No. 1.8e-111;

Matches 496; Conservative 0; Mismatches 47; Indels 0; Gaps 0; 

Qy   371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
         | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db     2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy   431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
         |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db    62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy   491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
         ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db   122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy   551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
         | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db   182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy   611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
         ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db   242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy   671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
         |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db   302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy   731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
         ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db   362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy   791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850
         ||||||| ||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db   422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy   851 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 910
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   482 TCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGC 541

Qy   911 AAA 913
         |||
Db   542 AAA 544



RESULT 13 

US-08-525-864A-5 

; Sequence 5, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: ASCII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1207 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 2..394

US-08-525-864A-5 

Query Match 36.2%; Score 359.6; DB 2; Length 1207; 

Best Local Similarity 94.0%; Pred. No. 1.2e-83;

Matches 374; Conservative 0; Mismatches 24; Indels 0; Gaps 0; 

Qy   597 AAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTG 656
         |||||||||||| |||||||||||||| |||||||||||||||||||| ||||| |||||
Db     1 AAAGAACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTG 60

Qy   657 CGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGT 716
          ||||| ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||
Db    61 TGAGGCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGT 120

Qy   717 GAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTC 776
         ||||||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||
Db   121 GAGCACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTC 180

Qy   777 CTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAA 836
         ||| || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||
Db   181 CTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAA 240

Qy   837 ATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACAT 896
         ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 ATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACAT 300

Qy   897 GCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCA 956
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 GCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCA 360

Qy   957 ATGGTCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
         ||||||||||||||||||||||||||||||||||||||
Db   361 ATGGTCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 398



RESULT 14 
US-08-525-864A-18 

; Sequence 18, Application US/08525864A
; Patent No. 5912326
; GENERAL INFORMATION:
; APPLICANT: Chang, Han
; TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses
; TITLE OF INVENTION: Related thereto
; NUMBER OF SEQUENCES: 18
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: LAHIVE & COCKFIELD
; STREET: 28 State Street
; CITY: Boston
; STATE: Massachusetts
; COUNTRY: USA
; ZIP: 02109
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: ASCII (text)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/525,864A
; FILING DATE: 8-SEP-1995
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kara, Catherine J.
; REGISTRATION NUMBER: 41,106
; REFERENCE/DOCKET NUMBER: HUI-017
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (617)227-7400
; TELEFAX: (617)742-4214
; INFORMATION FOR SEQ ID NO: 18:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 142 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA

US-08-525-864A-18 



Query Match 13.6%; Score 135.6; DB 2; Length 142; 

Best Local Similarity 97.2%; Pred. No. 4.2e-26; 

Matches 138; Conservative 0; Mismatches 4; Indels 0; Gaps 0; 

Qy   792 AGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATT 851
         |||||| |||||||||||||| ||||||||||| |||||||||||||||||||| |||||
Db     1 AGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGTCCAAACGGATT 60

Qy   852 CTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCA 911
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 CTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGATCCTAAGCA 120

Qy   912 AAGTGTCCTGTGGGATACACCG 933
         ||||||||||||||||||||||
Db   121 AAGTGTCCTGTGGGATACACCG 142
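
The row of match marks printed between each Qy and Db row can be regenerated from the two sequence rows. The sketch below is an illustration only, not the search program's own code: `match_line` is a hypothetical helper, and treating `-` as the gap character is an assumption based on the gapped alignments elsewhere in this report. The two sequences are the first block of the RESULT 14 alignment (Qy 792-851, Db 1-60).

```python
def match_line(qy: str, db: str, gap: str = "-") -> str:
    """Return one '|' per identical aligned pair and a space otherwise.

    Both rows must already be aligned, i.e. have the same length.
    A gap character never counts as a match (assumed convention).
    """
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same length")
    return "".join("|" if a == b and a != gap else " " for a, b in zip(qy, db))


# First alignment block of RESULT 14, copied from this report.
QY = "AGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATT"
DB = "AGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGTCCAAACGGATT"

if __name__ == "__main__":
    marks = match_line(QY, DB)
    print(QY)
    print(marks)
    print(DB)
    # 1-based alignment columns where the two rows differ
    print("mismatch columns:", [i + 1 for i, m in enumerate(marks) if m == " "])
```

Counting the spaces in the returned string over every block of a result gives the "Mismatches" figure on its summary line, provided the alignment has no gaps.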



RESULT 15 

US-08-036-555B-149 

; Sequence 149, Application US/08036555B
; Patent No. 5530109
; GENERAL INFORMATION:
; APPLICANT: Goodearl, Andrew; Stroobant, Paul;
; APPLICANT: Minghetti, Luisa; Waterfield, Michael; Marchioni, Mark;
; APPLICANT: Chen, Maio Su; Hiles, Ian
; TITLE OF INVENTION: Glial Mitogenic Factors, Their
; TITLE OF INVENTION: Preparation and Use
; NUMBER OF SEQUENCES: 184
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Felfe & Lynch
; STREET: 805 Third Avenue
; CITY: New York City
; STATE: New York
; COUNTRY: USA
; ZIP: 10022
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette, 5.25 inch, 360 kb storage
; COMPUTER: IBM
; OPERATING SYSTEM: PC-DOS
; SOFTWARE: Wordperfect
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/036,555B
; FILING DATE: 24-MAR-1993
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/965,173
; FILING DATE: 23-OCT-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/940,389
; FILING DATE: 03-SEP-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/907,138
; FILING DATE: 30-JUN-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 07/863,703
; FILING DATE: 03-APRIL-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: U.K. 91 07566.3
; FILING DATE: 10-APRIL-1991
; ATTORNEY/AGENT INFORMATION:
; NAME: Tsai, Christine H.
; REGISTRATION NUMBER: 34,266
; REFERENCE/DOCKET NUMBER: LUD 5250.4
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (212) 688-9200
; TELEFAX: (212) 838-3884
; INFORMATION FOR SEQ ID NO: 149:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1140
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
US-08-036-555B-149 

Query Match 9.6%; Score 95.4; DB 1; Length 1140; 

Best Local Similarity 50.7%; Pred. No. 2.2e-15; 

Matches 409; Conservative 0; Mismatches 361; Indels 37; Gaps 6; 

Qy 194 GGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCAGCGCG 253 

I I I I I I I I II I I I I I I Mill I I I I I I I I | 

Db 11 GGGCGGCGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGGCGCCT 70



Qy 254 AGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTT 313



Db 71 GGGGCCACCCCGCCTTCCCCTCCTGCGGGCGCCTCAAGGAGGACAGCAGGTACATCTTCT 130 

Qy 314 TCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACC- 372 

I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 131 TCATGGAGCCCGAGGCCAACAGCAGCGGCGGGCCCGGCCGCCTTCCGAGCCTCCTTCCCC 190 

Qy 373 AAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GACT GC 423 

I III I I I I I I I I I I I I I I I II I I I I I I I I I I III 
Db 191 CCTCTCGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAACGG-TGC 249

Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483

III I III I I II I I I I I I I I I I I I I I I II II I II I I 

Db 250 GCCTTGCCTCC C CGCTT GAAAGAGAT GAAGAGTCAGGAGT CT GT GGCAGGT T C CAAACT A 309 

Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543

Ml I I I I I I I II I II I II I I I II I I II I I 

Db 310 GT GCT T C GGTGC GAGAC C AGTT CT GAAT ACT C CT CT CT CAAGT T CAAGT GGT T CAAGAAT 369 

Qy 544 GGCAAGGAGCTCAACCG CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600 

III II I I I I I III I I I I I I Ml I 

Db 370 GGGAGT GAATTAAGC CGAAAGAACAAAC C AGAAAAC AT CAAGAT AC AGAAAAGG CC GGGG 429 

Qy 601 AACT C AC GACTAC AGTT CAACAAGGT GAAGGT GGAGGAC GCT GGGGAGT AT GT CT GC GAG 660 

I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I 

Db 430 AAGT C AGAACTT C GC AT TAG CAAAGC GT C ACT GGCT GAT T CT GGAGAAT AT AT GT GCAAA 489 

Qy 661 GC C GAGAACAT CCT GGGGAAGGACA CCGTCCGGGGCCGGCTTTACGTCAACAGC 714 

I I I I I I I I I I I I I I III I II I II 

Db 490 GT GAT CAGCAAACT AGGAAAT GACAGT GCCT CT GCCAACAT CACCATT GT GGAGTCAAAC 54 9 

Qy 715 GT GAGCAC C AC CCT GTCAT C CT GGT C GGGGC AC GC CC GGAAGT GCAACGAGACAGC CAAG 774 

I I I I I I III III III I I I I I I I I I I II 

Db 550 GC CACAT CC AC AT CT ACAGCT GGGACAAGC C AT CT T GT CAAGT GT GCAGAGAAGGAGAAA 609 

Qy 775 T C CTAT T GC GT CAAT GGAGGCGT CT GCT ACT ACAT C GAGGGC ATCAACCAGCT CTC 830 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 610 ACTTT CT GT GT GAAT GGAGGCGAGT GCT T CAT GGT GAAAGAC CTTT CAAAT CC CTCAAGA 669 

Qy 831 CT GCAAAT GT CCAAATGGAT T CT T C GGACAGAGAT GT T T GGAGAAACT GCCT TT G 885 

I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I II 
Db 67 0 TACTTGTGCAAGTGCCAACCTGGATT CACT GGAGCGAGAT GTACT GAGAAT GT GCCCAT G 72 9 

Qy 886 C GATT GT ACAT GC CAGAT C CTAAGCAAAGT GT C CT GT GGGAT ACAC C GGGGAC AGGT GT C 945 

II II I I I I I I I I I I I II I I I I I I I I I II I 

Db 730 AAAGT C C AAA C C C AAGAAAAGT GC C CAAAT GAGT T TACT GGT GAT C GCT GCC 781 

Qy 946 AGCAGTTCGCAATGGTCAACTTCTCCA 972

I I I I I I I I I I I I I I I I I I I 

Db 782 AAAACTACGTAATGGCCAGCTTCTACA 808 
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
Title:          US-09-864-675-1
Perfect score:  994
Sequence:       1 atgaggcgcgacccggcccc...caccttggatttgaattaaa 994

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       2324096 seqs, 1762381658 residues

Total number of hits satisfying chosen parameters: 4648192

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :       Published_Applications_NA:*
                 1:  /cgn2_6/ptodata/1/pubpna/US07_PUBCOMB.seq:*
                 2:  /cgn2_6/ptodata/1/pubpna/PCT_NEW_PUB.seq:*
                 3:  /cgn2_6/ptodata/1/pubpna/US06_NEW_PUB.seq:*
                 4:  /cgn2_6/ptodata/1/pubpna/US06_PUBCOMB.seq:*
                 5:  /cgn2_6/ptodata/1/pubpna/US07_NEW_PUB.seq:*
                 6:  /cgn2_6/ptodata/1/pubpna/PCTUS_PUBCOMB.seq:*
                 7:  /cgn2_6/ptodata/1/pubpna/US08_NEW_PUB.seq:*
                 8:  /cgn2_6/ptodata/1/pubpna/US08_PUBCOMB.seq:*
                 9:  /cgn2_6/ptodata/1/pubpna/US09A_PUBCOMB.seq:*
                 10: /cgn2_6/ptodata/1/pubpna/US09B_PUBCOMB.seq:*
                 11: /cgn2_6/ptodata/1/pubpna/US09C_PUBCOMB.seq:*
                 12: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq:*
                 13: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq2:*
                 14: /cgn2_6/ptodata/1/pubpna/US10A_PUBCOMB.seq:*
                 15: /cgn2_6/ptodata/1/pubpna/US10B_PUBCOMB.seq:*
                 16: /cgn2_6/ptodata/1/pubpna/US10_NEW_PUB.seq:*
                 17: /cgn2_6/ptodata/1/pubpna/US60_NEW_PUB.seq:*
                 18: /cgn2_6/ptodata/1/pubpna/US60_PUBCOMB.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
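
The summary figures printed for each result (Score, Query Match, Best Local Similarity) can be cross-checked against the match counts on the same result. The sketch below is a hedged reconstruction, not the search program's documented method: the match reward of 1.0 and the mismatch penalty of 0.6 are assumptions, inferred only because, together with the Gapop 10.0 and Gapext 1.0 values in the header, they reproduce the scores printed in this report. `Hit`, `score`, `query_match`, and `best_local_similarity` are illustrative names.

```python
from dataclasses import dataclass

PERFECT_SCORE = 994.0    # "Perfect score" from the search header
GAPOP = 10.0             # gap-open penalty from the search header
GAPEXT = 1.0             # gap-extension penalty from the search header
MATCH_REWARD = 1.0       # ASSUMPTION: not stated in the report
MISMATCH_PENALTY = 0.6   # ASSUMPTION: inferred from the printed scores


@dataclass(frozen=True)
class Hit:
    """Counts taken from the 'Matches; Mismatches; Indels; Gaps' line."""
    matches: int
    mismatches: int
    indels: int   # total gapped positions
    gaps: int     # number of separate gaps


def score(hit: Hit) -> float:
    """Alignment score under the assumed identity scoring scheme."""
    return (hit.matches * MATCH_REWARD
            - hit.mismatches * MISMATCH_PENALTY
            - hit.gaps * GAPOP
            - hit.indels * GAPEXT)


def query_match(hit: Hit) -> float:
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score(hit) / PERFECT_SCORE


def best_local_similarity(hit: Hit) -> float:
    """Matches as a percentage of all aligned columns."""
    columns = hit.matches + hit.mismatches + hit.indels
    return 100.0 * hit.matches / columns


if __name__ == "__main__":
    # Counts printed for RESULT 2 (US-10-096-241-5) in this report.
    hit = Hit(matches=914, mismatches=5, indels=1, gaps=1)
    print(f"Score {score(hit):.1f}; "
          f"Query Match {query_match(hit):.1f}%; "
          f"Best Local Similarity {best_local_similarity(hit):.1f}%")
```

The Pred. No. value is left out on purpose: the report says only that it comes from an analysis of the total score distribution, which is not enough to reproduce it.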
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RESULT 1 
US-09-864-675-1 

; Sequence 1, Application US/09864675
; Patent No. US20020081286A1
; GENERAL INFORMATION:
; APPLICANT: Marchionni, Mark
; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES,
; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS
; FILE REFERENCE: 04585/049002
; CURRENT APPLICATION NUMBER: US/09/864,675
; CURRENT FILING DATE: 2001-05-23
; PRIOR APPLICATION NUMBER: US 60/206,495
; PRIOR FILING DATE: 2000-05-23
; NUMBER OF SEQ ID NOS: 18
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 1
; LENGTH: 994
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-864-675-1 

Query Match 100.0%; Score 994; DB 9; Length 994; 

Best Local Similarity 100.0%; Pred. No. 1.3e-279;

Matches 994; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy   421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy   481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy   541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy   601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy   661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy   721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy   781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

Qy   841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCA 900

Qy   901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 GATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGG 960

Qy   961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994
         ||||||||||||||||||||||||||||||||||
Db   961 TCAACTTCTCCAAGCACCTTGGATTTGAATTAAA 994



RESULT 2 
US-10-096-241-5 

; Sequence 5, Application US/10096241
; Publication No. US20020127594A1
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/096,241
; FILING DATE: 12-Mar-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX: <Unknown>
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1884 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 664...1883
; OTHER INFORMATION:
; SEQUENCE DESCRIPTION: SEQ ID NO: 5:
US-10-096-241-5 

Query Match 90.5%; Score 900; DB 14; Length 1884; 

Best Local Similarity 99.3%; Pred. No. 4.1e-252; 

Matches 914; Conservative 0; Mismatches 5; Indels 1; Gaps 1;

Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277 

Qy 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 12 0 

M I I I I I I I I I I I M I I I M I I I I I I II I I I I I I M I | | || | | | || | M | M | | | | | | | | 
Db 278 TACT CGCCCAGCCTCAAGT CAGT GCAGGACCAGGCGT ACAAGGCACCCGTGGT GGT GGAG 337 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 18 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 
Db 338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | I I I I I I I I I I I I I I I I 
Db 458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II 
Db 518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577 

Qy 361 C C C CT C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAG GT GGGCAAGAT C CT GT GC ACT GAC 42 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 578 C C C C T - GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GGC 636 



Qy 421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696 



Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756 

Qy 541 GAT GGCAAGGAGCT CAAC C GC AGC C GAGAC ATT C GC AT CAAAT AT GGCAAC GGCAGAAAG 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I 

Db 757 GAT GGCAAGGAGCT CAAC C GCAGC C GAGACAT T C GCAT CAAAT AT GGCAAC GGC AGAAAG 816 

Qy 601 AACT C AC GACT AC AGTT CAACAAGGT GAAGGT GGAGGAC GCT GGGGAGT AT GT CT GC GAG 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876 

Qy 661 GC C GAGAAC AT C CT GGGGAAGGACACC GT C CGGGGCC GG CTTT AC GT CAACAGC GT GAGC 720 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 877 GC C GAGAACAT C CT GGGGAAGGACACC GT C CGGGGCC GGCTT T AC GT CAACAGC GT GAGC 936 

Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I 
Db 937 AC CAC C CT GT CAT C CT GGT C GGGGCAC GC C CGGAAGT GCAAC GAGACAGC CAAGT CCT AT 996 

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 997 T GC GT CAAT GGAG GCGT CT GCT ACT ACAT C GAGGGC AT CAAC CAGCT C T C CT GCAAAT GT 1056 

Qy 841 C CAAAT GGATT CT T C GGAC AGAGAT GT TT GGAGAAACT GC CT T T GCGATT GT AC ATGC CA 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1057 C CAAAT GGATT CT T C GGACAGAGAT GT T T GGAGAAACT GC CTTT GCGAT T GT AC ATGCCA 1116 

Qy 901 GATCCTAAGCAAAGTGTCCT 920 

I I I I I I I I I I II I III 
Db 1117 GAT C CTAAGCAAAAGC AC CT 1136 



RESULT 3 
US-09-864-675-3 

; Sequence 3, Application US/09864675 

; Patent No. US20020081286A1 

; GENERAL INFORMATION: 

; APPLICANT: Marchionni, Mark 

; TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES, 

; TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS 

; FILE REFERENCE: 04585/049002 

; CURRENT APPLICATION NUMBER: US/09/864,675 

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/206,495 

; PRIOR FILING DATE: 2000-05-23 

; NUMBER OF SEQ ID NOS : 18 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

LENGTH: 897 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-864-675-3 



Query Match 85.4%; Score 849; DB 9; Length 897; 

Best Local Similarity 98.3%; Pred. No. 2.4e-237; 

Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0; 



Qy 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60 

Qy 61 T ACT C GC C CAGC CT CAAGT C AGT GCAGGAC CAGGCGT ACAAGGCAC C C GT GGT GGT GGAG 120 

M I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120 

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I 
Db 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180 

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240 

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300 

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I | | | | | | || | | | | | | | | | | 1 | | 
Db 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360 

Qy 361 C C C CT C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT CCT GT G C ACT GAC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 C C C CT C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT CCT GT GC ACT GAC 420 

Qy 421 TGCGCCACCCGGC CCAAGTTGAAGAAGAT GAAGAGCCAGACGGGACAGGT GGGTGAGAAG 480 

I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 T GC GCCACC C GGC CCAAGTTGAAGAAGAT GAAGAGC CAGACG GGAC AGGT GGGT GAGAAG 480 

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540 

Qy 541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I I I I I I I I I I I I I I I I I I I I 
Db 541 GAT GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAAT AT GGCAAC GGC AGAAAG 600 

Qy 601 AACT CACGACT ACAGTTCAACAAGGTGAAGGT GGAGGACGCT GGGGAGTAT GT CTGCGAG 660 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 601 AACT C AC GACT AC AGT T CAACAAGGT GAAGGT GGAGGAC GCT GGGGAGTAT GT CT GC GAG 660 

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I | I I I I I I I II I I I I I I I I I I I I I I I I 
Db 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720 

Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780 



Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 

Db 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840 

Qy 841 C CAAAT GGAT T CT T CGGACAGAGAT GTT T GGAG 873 

II I I I I I I I I I I I I I I I I I 

Db 841 C CT GT GGGAT AC AC C GGGGACAGGT GT C AG C AG 873 



RESULT 4 
US-10-096-241-3 

; Sequence 3, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP : 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241 
; FILING DATE: 12-Mar-2002 

; CLASSIFICATION: <Unknown> 

; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

; REFERENCE/ DOCKET NUMBER: 07334/022001 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 1607 base pairs 

; TYPE: nucleic acid 

; STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE : 

NAME/KEY: Coding Sequence 
LOCATION: 79...621 
OTHER INFORMATION: 



SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
US-10-096-241-3 



Query Match 55.1%; Score 547.2; DB 14; Length 1607; 

Best Local Similarity 92.3%; Pred. No. 3.3e-149; 

Matches 576; Conservative 0; Mismatches 48; Indels 0; Gaps 0; 

Qy 371 CCAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GACT GCGC CAC C C 430 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 2 CTAACGGCAAAAAC AT CAAGAAAGAGGT GGGCAAGAT C CT GT GC ACT GACT GC GCCAC C C 61 

Qy 431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490 

I I I I I I I I II I I I I I I I I I I I I I II I II I I III I I I I I II I I I I I I I I I I I I I I I 
Db 62 GGCC CAAGCT GAAGAAGAT GAAGAGC CAGAC AGGAGAGGT GGGT GAGAAGCAGT C GCT CA 121 

Qy 491 AGT GT GAGGC AGCAGC C GGTAAT CC C CAGC CT T CCT AC CGT T GGT T CAAGGAT GGCAAGG 550 

I I I I I I I I I I I I I II II II I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I 
Db 122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181 

Qy 551 AGCTCAACCGCAGCCGAGACATTCGCAT CAAAT ATGGCAACGGCAGAAAGAACT CACGAC 610 

I I I I I I I I I II II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 182 AACT CAAC C GGAGT CGT GAT ATT C GCAT CAAGT ATGGCAAT GTC AGAAAGAACT CAC GGC 241 

Qy 611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I II I II I I I I I I I I 
Db 242 TACAGTT CAACAAAGT GAGGGTGGAGGAT GCCGGGGAGT ACGTCTGTGAGGCCGAGAACA 301 

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I 
Db 302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361 

Qy 731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I 
Db 362 CAT CCT GGTC GGGAC AT GC C C GGAAGT GCAAT GAGAC C GC CAAGT C CT ACT GT GT GAAT G 421 

Qy 791 GAGGCGT CT GCT ACT ACAT C GAGGGC AT CAAC C AGCT CT C CT GCAAAT GT C CAAAT GGAT 850 

I I I I I I I I I I I I I I I I I II I I I I I.I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 422 GAGGCGT GTGCT ACT ACAT C GAGGGC AT CAAC C AGCT CT C CT GCAAAT GT C CAAACG GAT 481 

Qy 851 T CT T CG GACAGAGAT GTT T GGAGAAACT GCCT TT GC GATT GT ACAT GC C AGAT C CTAAGC 910 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 482 T CT T CG GACAGAGAT GTT T GGAGAAACT GCCT TT GC GATT GT ACAT GC CAGAT C CTAAGC 541 

Qy 911 AAAGTGT CCT GTGGGATACACCGGGGACAGGT GT CAGCAGTT CGCAAT GGT CAACTT CT C 970 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 542 AAAGT GT C CT GT G GGAT AC AC CGGGGAC AGGT GT CAGCAGT T CG CAAT GGT CAACTT CT C 601 

Qy 971 CAAGCAC CTT GGAT T T GAAT TAAA 994 

I I I I I I I I I I I I I I I I I I I I I II 
Db 602 CAAGCAC CTT GGAT T T GAAT T GAA 625 



RESULT 5 
US-10-096-241-7 

; Sequence 7, Application US/10096241 
; Publication No. US20020127594A1 
GENERAL INFORMATION: 



APPLICANT: Gearing, David P. 

Busfield, Samantha J. 
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241 
; FILING DATE: 12-Mar-2002 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

; REFERENCE/ DOCKET NUMBER: 07334/022001 

; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: 617-542-5070 

; TELEFAX: 617-542-8906 

; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1476 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

; NAME/KEY: Coding Sequence 

LOCATION: 69...1475 
OTHER INFORMATION: 
SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
US-10-096-241-7 

Query Match 49.5%; Score 492; DB 14; Length 1476; 

Best Local Similarity 92.8%; Pred. No. 4e-133; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 363 C CT C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GG GCAAGAT C CT GT GC ACT GACT G 422 

I I I I I I I III I I I I II I I I I I I II 

Db 98 C CGC GGCAAGAAGCAC CC AGAGGG GAGGAAGCGGGAGAGG GAGC C C GAT CC CGGGGAGAA 157 

Qy 423 CGCCACCCGGCCCAAGTT GAAGAAGATGAAGAGCCAGACGGGACAGGT GGGTGAGAAGCA 482 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 158 AGC CAC C C GGC C CAAGTT GAAGAAGATGAAGAGCCAGAC GGGAC AGGT GGGTGAGAAGCA 217 



Qy 483 ATC GCT GAAGT GT GAGGC AGC AGC C GGTAAT C CC C AGC CT T CCT AC C GT T GGT T CAAGGA 542 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277 

Qy 543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602 

I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337 

Qy 603 CT C AC GACT ACAGTT CAACAAGGT GAAGGT GGAGGAC GCT GGGGAGT AT GT CT GC GAGGC 662 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 338 CT C AC GACT ACAGTT CAACAAGGT GAAGGT GGAGGAC GCT GGGGAGT AT GT CT GCGAGGC 397 

Qy 663 CGAGAAC AT C CT GGGGAAGGACAC C GT C C GGGGC C GGCTT T AC GTCAAC AGC GT GAGCAC 722 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 398 CGAGAAC AT C CT GGGGAAGGACAC CGT C C GGGGC C GGCTT T AC GTCAAC AGC GT GAGCAC 457 

Qy 723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517 

Qy 783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 842 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577 

Qy 843 AAAT GGAT T CTT C GGACAGAGAT GTTT GGAGAAACT GC CTT T GCGATT GT ACAT GC CAGA 902 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637 

Qy 903 TCCTAAGCAAAGTGTC 918 

I I I I I I I I I I I I I 
Db 638 TCCTAAGCAAAAAGCC 653 



RESULT 6 

US-10-096-241-31 

; Sequence 31, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J. 

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP : 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 



APPLICATION NUMBER: US/10/096,241 
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION : 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/ DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 31: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2268 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME /KEY : Coding Sequence 
LOCATION: 69...2009 
OTHER INFORMATION: 
SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
US-10-096-241-31 

Query Match 49.5%; Score 492; DB 14; Length 2268; 

Best Local Similarity 92.8%; Pred. No. 4.6e-133; 

Matches 516; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 363 CCT C GAT AC CAAC GGCAAAAAT CT CAAGAAAGAGGT GGGCAAGAT C CT GT G C AC T GACT G 422 

II II I II III I I I I I I I I I I I I II 

Db 98 CC GC GGCAAGAAGCAC C CAGAGGGGAGGAAGC GGGAGAGGGAGCCC GAT C C CGG GGAGAA 157 

Qy 423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 158 AGC C AC C C GGC C CAAGTT GAAGAAGAT GAAGAGC C AGACGGGAC AGGTGGGTGAGAAGCA 217 

Qy 483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I 
Db 218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277 

Qy 543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337 

Qy 603 CT C ACGACT AC AGT T CAACAAGGT GAAGGT GGAGGAC GCT GGG GAGT AT GT CT GC GAGGC 662 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 338 CT C ACGACT AC AGT T CAACAAGGT GAAGGT GGAGGAC GCT GG GGAGT AT GT CT GC GAGGC 397 

Qy 663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 398 CGAGAACAT C CT GGGGAAGGACACC GT C CGGGG C CGGCTT T AC GT CAAC AGCGT GAGCAC 457 



Qy 723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517 

Qy 783 CGT CAAT GGAGG C GT CT GCT ACT ACAT CGAGGGCAT CAAC C AGCT C T C CT GCAAAT GT C C 842 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 518 CGT CAAT GGAGGC GT CT GCTACT ACAT C GAGGGC AT CAAC C AG CT CT C CT G CAAAT GT C C 577 

Qy 843 AAAT GGAT T CTT C GGACAGAGAT GTT T GGAGAAACT GC CT T T G C GATT GT ACAT GC CAGA 902 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 578 AAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTGCGATTGTACATGCCAGA 637 

Qy 903 TCCTAAGCAAAGTGTC 918 

I I I I I I I I I I I I I 
Db 638 TCCTAAGCAAAAAGCC 653 



RESULT 7 
US-10-096-241-1 

; Sequence 1, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J. 

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
; NUMBER OF SEQUENCES: 33 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 
; CITY: Boston 

STATE : MA 
COUNTRY: US 
ZIP : 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

; SOFTWARE: FastSEQ Version 2.0 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096, 241 
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: 617-542-5070 

; TELEFAX: 617-542-8906 

; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 1: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 2467 base pairs 

; TYPE: nucleic acid 



STRANDEDNESS : single 
TOPOLOGY: circular 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/ KEY: Coding Sequence 
LOCATION: 79... 1893 
OTHER INFORMATION: 
SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
US-10-096-241-1 

Query Match 47.1%; Score 467.8; DB 14; Length 2467; 

Best Local Similarity 91.3%; Pred. No. 5.5e-126; 

Matches 496; Conservative 0; Mismatches 47; Indels 0; Gaps 0; 

Qy 371 C CAAC GGCAAAAAT CT CAAGAAAGAGGT GG GCAAGAT CCT GTGC ACT GACT GCGC C ACC C 430 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 2 CT AAC GGCAAAAACAT CAAGAAAGAGGT GGGCAAGAT CCT GTGC ACT GACT GCGC CACC C 61 

Qy 431 GGC C CAAGT T GAAGAAGAT GAAGAGC CAGAC GGGAC AGGT GGGT GAGAAGCAAT C GCT GA 490 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I II I I I I I I I I II I I 
Db 62 GGC CCAAGCT GAAGAAGAT GAAGAGC CAGACAGGAGAGGT GGGT GAGAAGCAGT CGCTCA 121 

Qy 491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550 

I I I I I I I I I I I I I II II II I I I II I I I I I I I I II I I I I I I I I I I I I I I II I I I 
Db 122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181 

Qy 551 AGCT CAAC C GC AGC C GAGACATT C GCAT CAAAT AT GGCAAC GGC AGAAAGAACT C ACGAC 610 

I I I I I I I I I II II II II I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 182 AACT CAAC CGGAGT C GT GAT ATT C GC AT CAAGT AT GGCAAT GT C AGAAAGAACT C AC GGC 241 

Qy 611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 242 T AC AGT T CAACAAAGT GAGGGTGGAGGAT GC C GGGGAGT AC GT CT GT GAGGC CGAGAACA 301 

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I 
Db 302 T CCT T GGGAAGGACAC CGT GAGGGGC C GACT C CAT GT CAAC AGC GT GAGC AC C ACT CTGT 361 

Qy 731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790 

I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I II I I I I I II II I I I I 
Db 362 CAT CCT GGT C GGGACAT GCC C GGAAGT GCAAT GAGAC CGC CAAGT C CT ACT GT GT GAAT G 421 

Qy 791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGAT 850 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481 

Qy 851 T CTT C GGAC AGAGAT GT TT GGAGAAACT GCCT T TG C GAT T GT AC AT GC C AGAT C CTAAGC 910 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 482 T CTT CGGAC AGAGAT GT TT GGAGAAACT GCCT T TGC GAT T GT AC AT GC C AGAT C CTAAGC 541 

Qy 911 AAA 913 

III 

Db 542 AAA 544 



RESULT 8 

US-10-029-386-26613 



Sequence 26613, Application US/10029386 
Publication No. US20030194704A1 
GENERAL INFORMATION: 
APPLICANT: Penn, Sharron G. 
APPLICANT: Rank, David R. 
APPLICANT: Hanzel, David K. 

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES 
USEFUL FOR GENE 

TITLE OF INVENTION: EXPRESSION ANALYSIS TWO 
FILE REFERENCE: AEOMICA-X-2 
CURRENT APPLICATION NUMBER: US/10/029,386 
CURRENT FILING DATE: 2001-12-20 
NUMBER OF SEQ ID NOS : 34288 

SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
SEQ ID NO 26613 
LENGTH: 201 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 
OTHER INFORMATION: MAP TO CHR5.3 
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 0.55 
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 0.49 
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 0.72 
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 0.66 
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 3.00e-29 
OTHER INFORMATION: NT HIT: AF119152.1, EVALUE 1.00e-109 
OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 3.00e-93 
US-10-029-386-26613 



Query Match 17.4%; Score 173; DB 13; Length 201; 

Best Local Similarity 100.0%; Pred. No. 2.5e-40; 

Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 27 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 86 

Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 87 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 146 

Qy 544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 147 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 199 



RESULT 9 

US-10-029-386-12913 

; Sequence 12913, Application US/10029386 

; Publication No. US20030194704A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 

; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K. 

; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES 
USEFUL FOR GENE 

; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO 



FILE REFERENCE: AEOMICA-X-2 

CURRENT APPLICATION NUMBER: US/10/029,386 

CURRENT FILING DATE: 2001-12-20 

NUMBER OF SEQ ID NOS : 34288 

SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
SEQ ID NO 12913 
LENGTH : 573 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 
OTHER INFORMATION: MAP TO CHR5.3 
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 0.55 
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 0.49 
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 0.72 
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 0.66 
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 2.00e-28 
OTHER INFORMATION: NT HIT: AF119152.1, EVALUE 0.00e+00 
OTHER INFORMATION: EST_HUMAN HIT: BG996653.1, EVALUE 1.00e-108 
US-10-029-386-12913 



Query Match 17.4%; Score 173; DB 13; Length 573; 

Best Local Similarity 100.0%; Pred. No. 3.4e-40; 

Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 377 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 436 

Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 437 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 496 

Qy 544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 497 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 549 



RESULT 10 

US-10-029-386-2532/c 

; Sequence 2532, Application US/10029386 

; Publication No. US20030194704A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 

; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K. 

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES 
USEFUL FOR GENE 

; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO 
; FILE REFERENCE: AEOMICA-X-2 

CURRENT APPLICATION NUMBER: US/10/029, 386 
; CURRENT FILING DATE: 2001-12-20 
; NUMBER OF SEQ ID NOS: 34288 

; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
; SEQ ID NO 2532 

LENGTH: 579 

TYPE: DNA 
; ORGANISM: Homo sapiens 



FEATURE: 
OTHER INFORMATION: MAP TO CHR5.1 
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 2.4 
OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL = 3.9 
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 3.5 
OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL = 4.3 
OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL = 5.2 
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 3.4 
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 4.6 
OTHER INFORMATION: NT HIT: AF119153.1, EVALUE 0.00e+00 
OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 2.00e-57 
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 2.00e-12 
US-10-029-386-2532 



Query Match 11.4%; Score 113.6; DB 13; Length 579; 

Best Local Similarity 92.9%; Pred. No. 7.4e-23; 

Matches 130; Conservative 0; Mismatches 9; Indels 1; Gaps 1; 



Qy 594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 489 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 430 

Qy 654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCG-TCCGGGGCCGGCTTTACGTCAACA 712 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I   I I I I I I I I I I I I I I I I I I I I I I I I 
Db 429 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGCTCCGGGGCCGGCTTTACGTCAACA 370 

Qy 713 GCGTGAGCACCACCCTGTCA 732 

III I I I I I I I I 
Db 369 GCGGTAGGTGGGCCCAGACA 350 



RESULT 11 

US-10-029-386-16232/c 

; Sequence 16232, Application US/10029386 

; Publication No. US20030194704A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 

; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K. 

; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES 
USEFUL FOR GENE 

; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO 
; FILE REFERENCE: AEOMICA-X-2 

; CURRENT APPLICATION NUMBER: US/10/029,386 
; CURRENT FILING DATE: 2001-12-20 
; NUMBER OF SEQ ID NOS : 34288 

; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
; SEQ ID NO 16232 

LENGTH: 171 

TYPE: DNA 
; ORGANISM: Homo sapiens 

FEATURE : 

; OTHER INFORMATION: MAP TO CHR5.1 

OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL =2.4 

; OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL =3.9 
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL =3.5 
OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL =4.3 



OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL =5.2 

OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL =3.4 

OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL =4.6 

OTHER INFORMATION: NT HIT: AF119153.1, EVALUE 7.00e-87 

OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 3.00e-13 

OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 5.00e-58 
US-10-029-386-16232 



Query Match 11.2%; Score 111.8; DB 13; Length 171; 

Best Local Similarity 97.6%; Pred. No. 1.7e-22; 

Matches 124; Conservative 0; Mismatches 2; Indels 1; Gaps 1; 



Qy 594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 130 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 71 

Qy 654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCG-TCCGGGGCCGGCTTTACGTCAACA 712 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I   I I I I I I I I I I I I I I I I I I I I I I I I 
Db 70 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGCTCCGGGGCCGGCTTTACGTCAACA 11 

Qy 713 GCGTGAG 719 

I I I   I I 
Db 10 GCGGTAG 4 



RESULT 12 
US-08-736-019-149 

Sequence 149, Application US/08736019 
Publication No. US20030207799A1 
GENERAL INFORMATION: 

APPLICANT: Goodearl, Andrew 
APPLICANT: Stroobant, Paul 
APPLICANT: Minghetti, Luisa 
APPLICANT: Waterfield, Michael 
APPLICANT: Marchionni, Mark 
APPLICANT: Chen, Mario 
APPLICANT: Hiles, Ian 

TITLE OF INVENTION: GLIAL MITOGENIC FACTORS, THEIR 
TITLE OF INVENTION: PREPARATION AND USE 
NUMBER OF SEQUENCES: 189 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Clark & Elbing LLP 
STREET: 176 Federal Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: U.S.A. 
ZIP: 02110 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 
COMPUTER: IBM Compatible Pentium 
OPERATING SYSTEM: Windows95 
SOFTWARE: FastSeq Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/736,019
FILING DATE: 22-OCT-1996 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 



APPLICATION NUMBER: 08/471,833 

FILING DATE: 06-JUN-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/036,555 

FILING DATE: 24-MAR-1993 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: 07/965,173 

FILING DATE: 23-OCT-1992 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 07/907,138 

FILING DATE: 30-JUN-1992 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: 07/940,389 

; FILING DATE: 03-SEP-1992 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 
; FILING DATE: 03-APR-1992 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: UK 91 07566.3 
; FILING DATE: 10-APR-1991 

; ATTORNEY/AGENT INFORMATION: 

; NAME: Bieker-Brady, Kristina 

REGISTRATION NUMBER: 39,109 
; REFERENCE/DOCKET NUMBER: 04585/00200Q

; TELECOMMUNICATION INFORMATION:

; TELEPHONE: (617) 428-0200

TELEFAX: (617) 428-7045

TELEX:

; INFORMATION FOR SEQ ID NO: 149:
; SEQUENCE CHARACTERISTICS: 

LENGTH: 1140 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
US-08-736-019-149 



Query Match 9.6%; Score 95.4; DB 7; Length 1140; 

Best Local Similarity 50.7%; Pred. No. 1.9e-17; 

Matches 409; Conservative 0; Mismatches 361; Indels 37; Gaps 6; 

Qy 194 GGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCAGCGCG 253 

I I I I I I I I II I I I I I I Mill I II I I III I 

Db 11 GGGCGGCGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGGCGCCT 70 

Qy 254 AGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTT 313 

I I II I I I I I I I I I I I I I III I I I I I I II I I 

Db 71 GGGGCCACCCCGCCTTCCCCTCCTGCGGGCGCCTCAAGGAGGACAGCAGGTACATCTTCT 130 



Qy 314 TCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACC- 372 

I I I I I I I I I I II I I II I I I I MM III 

Db 131 TCATGGAGCCCGAGGCCAACAGCAGCGGCGGGCCCGGCCGCCTTCCGAGCCTCCTTCCCC 190 

Qy 373 AACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGC 423

I III II II I I I I I I I I I I I II I I I I I I II I I II I 

Db 191 CCTCTCGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAACGG-TGC 249 



Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483

       |||   |  |||   |||||  |||||||||| |||  |     ||  |||   || | |

Db 250 GCCTTGCCTCCCCGCTTGAAAGAGATGAAGAGTCAGGAGTCTGTGGCAGGTTCCAAACTA 309

Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543

III I I I I I I I II I II I II I I I I I I I I I II 

Db 310 GTGCTTCGGTGCGAGACCAGTTCTGAATACTCCTCTCTCAAGTTCAAGTGGTTCAAGAAT 369

Qy 544 GGCAAGGAGCTCAACCG CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600 

I I I I I I I I I I III I I II I I III I 

Db 370 GGGAGTGAATTAAGCCGAAAGAACAAACCAGAAAACATCAAGATACAGAAAAGGCCGGGG 429

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

I I I I I II I I I I I I I I I III II I I I I I I I II I II I I 

Db 430 AAGTCAGAACTTCGCATTAGCAAAGCGTCACTGGCTGATTCTGGAGAATATATGTGCAAA 489

Qy 661 GCCGAGAACATCCTGGGGAAGGACA CCGTCCGGGGCCGGCTTTACGTCAACAGC 714

I I I I I I I I I I I I I I Ml I M I II 

Db 490 GTGATCAGCAAACTAGGAAATGACAGTGCCTCTGCCAACATCACCATTGTGGAGTCAAAC 549

Qy 715 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 774

I I I I IE Ml III I I I I I I I I I I I I I II 

Db 550 GCCACATCCACATCTACAGCTGGGACAAGCCATCTTGTCAAGTGTGCAGAGAAGGAGAAA 609

Qy 775 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTC 830 

II I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 610 ACTTTCTGTGTGAATGGAGGCGAGTGCTTCATGGTGAAAGACCTTTCAAATCCCTCAAGA 669

Qy 831 CTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTG 885

I I I I I II I I I I I I I I I III I I I I I I I I I I I I - I I I I II 

Db 670 TACTTGTGCAAGTGCCAACCTGGATTCACTGGAGCGAGATGTACTGAGAATGTGCCCATG 729

Qy 886 CGATTGTACATGCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTC 945

II II I I I I I I I I I I I I I I I I I I I I I I M I 

Db 730 AAAGT C C AAA- C C CAAGAAAAGT GCCCAAAT GAGTT T ACT G GT GAT CGCTGC C 781 

Qy 946 AGCAGTTCGCAATGGTCAACTTCTCCA 972

I I I I I I I I I I II I II M II 
Db 782 AAAACTACGTAATGGCCAGCTTCTACA 808



RESULT 13 
US-09-366-886-55 

Sequence 55, Application US/09366886 
Publication No. US20030040465A1 
GENERAL INFORMATION: 
APPLICANT: Gywnne, David I. 
APPLICANT: Mahanthappa, Nagesh K. 
APPLICANT: Marchionni, Mark A. 
APPLICANT: Bermingham-McDonogh, Olivia 
APPLICANT: Goldin, Stanley M. 
APPLICANT: McBurney, Robert N. 

TITLE OF INVENTION: USE OF NEUREGULINS AS MODULATORS OF 
TITLE OF INVENTION: CELLULAR COMMUNICATION 
FILE REFERENCE: 04585/041005 
CURRENT APPLICATION NUMBER: US/09/366,886
CURRENT FILING DATE: 1999-08-04 
PRIOR APPLICATION NUMBER: US 08/341,018 



PRIOR FILING DATE: 1994-11-17 
; NUMBER OF SEQ ID NOS: 87

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 55 

LENGTH: 1140

TYPE: DNA
; ORGANISM: Bos taurus
; FEATURE:

NAME/KEY: CDS
; LOCATION: (1)...(840)
; OTHER INFORMATION: n is unknown. 
; OTHER INFORMATION: Xaa is unknown. 
US-09-366-886-55 



Query Match 9.6%; Score 95.4; DB 11; Length 1140; 

Best Local Similarity 50.7%; Pred. No. 1.9e-17; 

Matches 409; Conservative 0; Mismatches 361; Indels 37; Gaps 6;

Qy 194 GGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCAGCGCG 253 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

Db 11 GGGCGGCGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGGCGCCT 70 

Qy 254 AGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTT 313 

I III I I I I I I I I I I I I I III I I I I I I I I I I 

Db 71 GGGGCCACCCCGCCTTCCCCTCCTGCGGGCGCCTCAAGGAGGACAGCAGGTACATCTTCT 130 

Qy 314 TCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACC- 372 

I I I I II I I I I II I I II I I I I I I I I III 

Db 131 TCATGGAGCCCGAGGCCAACAGCAGCGGCGGGCCCGGCCGCCTTCCGAGCCTCCTTCCCC 190 

Qy 373 AACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGC 423 

I III I I I I I I I I I I I I I I I II I I I I I I I I I I III 
Db 191 CCTCTCGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAACGG-TGC 249

Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483

III I III I I I I I I I I I I I I I I I I I I I II III I I I I 

Db 250 GCCTTGCCTCCCCGCTTGAAAGAGATGAAGAGTCAGGAGTCTGTGGCAGGTTCCAAACTA 309

Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543

III I I I I I I I II III I II I I II I I I I I II 

Db 310 GTGCTTCGGTGCGAGACCAGTTCTGAATACTCCTCTCTCAAGTTCAAGTGGTTCAAGAAT 369

Qy 544 GGCAAGGAGCTCAACCG---CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

II I II I I III III I I I I I I III I 

Db 370 GGGAGTGAATTAAGCCGAAAGAACAAACCAGAAAACATCAAGATACAGAAAAGGCCGGGG 429



Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Mill I I I I I I III I I III M I I I I I I I I I I I M I 

Db 430 AAGTCAGAACTTCGCATTAGCAAAGCGTCACTGGCTGATTCTGGAGAATATATGTGCAAA 489



Qy 661 GCCGAGAACATCCTGGGGAAGGACA CCGTCCGGGGCCGGCTTTACGTCAACAGC 714 

I I I I II I I I I I I I I III I II I II 

Db 490 GTGATCAGCAAACTAGGAAATGACAGTGCCTCTGCCAACATCACCATTGTGGAGTCAAAC 549



Qy 715 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 774 

I I I I I I III III III I I I I I I I I I I II 

Db 550 GCCACATCCACATCTACAGCTGGGACAAGCCATCTTGTCAAGTGTGCAGAGAAGGAGAAA 609



Qy 775 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTC 830

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 610 ACTTTCTGTGTGAATGGAGGCGAGTGCTTCATGGTGAAAGACCTTTCAAATCCCTCAAGA 669

Qy 831 CTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTG 885

I I I I I I I I I I I I I II I III I I I I I I I I I I I I I I I I II 

Db 670 TACTTGTGCAAGTGCCAACCTGGATTCACTGGAGCGAGATGTACTGAGAATGTGCCCATG 729



Qy 886 CGATTGTACATGCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTC 945

II I I I I I I I I I I I I I I I I I Mill I I I I 

Db 730 AAAGT CCAAA CC CAAGAAAAGT GC CCAAAT GAGTTTACT GGT GAT C GCT GCC 781



Qy 946 AGCAGTTCGCAATGGTCAACTTCTCCA 972

I I I I I I I I I I I I I I I II I I 
Db 782 AAAACTACGTAATGGCCAGCTTCTACA 808



RESULT 14 
US-08-736-019-134 

Sequence 134, Application US/08736019 
Publication No. US20030207799A1 
GENERAL INFORMATION: 

APPLICANT: Goodearl, Andrew 
APPLICANT: Stroobant, Paul 
APPLICANT: Minghetti, Luisa 
APPLICANT: Waterfield, Michael 
APPLICANT: Marchionni, Mark 
APPLICANT: Chen, Mario 
APPLICANT: Hiles, Ian 

TITLE OF INVENTION: GLIAL MITOGENIC FACTORS, THEIR 
TITLE OF INVENTION: PREPARATION AND USE 
NUMBER OF SEQUENCES: 189 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Clark & Elbing LLP
STREET: 176 Federal Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: U.S.A. 
ZIP: 02110
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 
COMPUTER: IBM Compatible Pentium 
OPERATING SYSTEM: Windows95 
SOFTWARE: FastSeq Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/736,019
FILING DATE: 22-OCT-1996 
CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/471,833 
FILING DATE: 06-JUN-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/036,555 
FILING DATE: 24-MAR-1993 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/965,173 



; FILING DATE: 23-OCT-1992 

PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 07/907,138 

FILING DATE: 30-JUN-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/940,389 

FILING DATE: 03-SEP-1992 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 

FILING DATE: 03-APR-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: UK 91 07566.3 

FILING DATE: 10-APR-1991
; ATTORNEY/AGENT INFORMATION: 
; NAME: Bieker-Brady, Kristina 

REGISTRATION NUMBER: 39,109 

REFERENCE/DOCKET NUMBER: 04585/00200Q
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (617) 428-0200 
; TELEFAX: (617) 428-7045 

TELEX: 

; INFORMATION FOR SEQ ID NO: 134: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1193 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
US-08-736-019-134 

Query Match 9.4%; Score 93.4; DB 7; Length 1193; 

Best Local Similarity 50.6%; Pred. No. 7.3e-17; 

Matches 408; Conservative 0; Mismatches 361; Indels 38; Gaps 6; 

Qy 194 GGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCAGCGCG 253

I I I I I I I I II I I I II I I I I I I I I I I I I I I I 

Db 18 GGGCGGCGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGGCGCCT 77 

Qy 254 AGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTT 313 

I I II I I I I I I I I I I I I I III I I I I I I I I I I 

Db 78 GGGGCCACCCCGCCTTCCCCTCCTGCGGGCGCCTCAAGGAGGACAGCAGGTACATCTTCT 137

Qy 314 TCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACC- 372 

I I I I I I I I I I III I II I I I I I I I I III 

Db 138 TCATGGAGCCCGAGGCCAACAGCAGCGGCGGGCCCGGCCGCCTTCCGAGCCTCCTTCCCC 197 

Qy 373 AACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGC 423

I I M I I I I I I I I II I I I I I II I I I I I I I II I III 
Db 198 CCTCTCGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAACGG-TGC 256

Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483

III I III I I I I I I I I I I I II I I I I I I II III I I I I 

Db 257 GCCTTGCCTCCCCGCTTGAAAGAGATGAAGAGTCAGGAGTCTGTGGCAGGTTCCAAACTA 316 



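Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543

         |||   ||| ||| |     | |   |  ||   |     | |   ||||||||| ||

Db 317 GTGCTTCGGTGCGAGACCAGTTCTGAATACTCCTCTCTCAAGTTCAAGTGGTTCAAGAAT 376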

Qy 544 GGCAAGGAGCTCAACCG CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

II I II I I III III I I I I I I III I 

Db 377 GGGAGTGAATTAAGCCGAAAGAACAAACCAGAAAACATCAAGATACAGAAAAGGCCGGGG 436

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I 

Db 437 AAGTCAGGACTTCGCATTAGCAAAGCGTCACTGGCTGATTCTGGAGAATATATGTGCAAA 496

Qy 661 GCCGAGAACATCCTGGGGAAGGACA CCGTCCGGGGCCGGCTTTACGTCAACAGC 714

I I I I I I I I I I I I I I III I III II 

Db 497 GTGATCAGCAAACTAGGAAATGACAGTGCCTCTGCCAACATCACCATTGTGGAGTCAAAC 556

Qy 715 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 774

I I I I I I III III III I I I I I I I I I I II 

Db 557 GCCACATCCACATCTACAGCTGGGACAAGCCATCTTGTCAAGTGTGCAGAGAAGGAGAAA 616

Qy 775 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCC 831 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 617 ACTTTCTGTGTGAATGGAGGCGAGTGCTTCATGGTGAAAGACCTTTCAAATCCCTCAAGA 676 

Qy 832 TGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTG 885

Mill II I I I I II II I III II I III I Mill MM II 
Db 677 TACTTGTGCAAGTGCCAACCTGGATTCACTGGAGCGAGATGTACTGAGAATGTGCCCATG 736

Qy 886 CGATTGTACATGCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTC 945

II II III II II I I I I I I II II II I I I I 

Db 737 AAAGTCCAAACCCAAG AAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCC 787

Qy 946 AGCAGTTCGCAATGGTCAACTTCTCCA 972 

I I I II II II I I I I II II II 
Db 788 AAAACTACGTAATGGCCAGCTTCTACA 814



RESULT 15 
US-09-366-886-3 

Sequence 3, Application US/09366886 
Publication No. US20030040465A1 
GENERAL INFORMATION: 
APPLICANT: Gywnne, David I. 
APPLICANT: Mahanthappa, Nagesh K. 
APPLICANT: Marchionni, Mark A. 
APPLICANT: Bermingham-McDonogh, Olivia
APPLICANT: Goldin, Stanley M. 
APPLICANT: McBurney, Robert N. 

TITLE OF INVENTION: USE OF NEUREGULINS AS MODULATORS OF 
TITLE OF INVENTION: CELLULAR COMMUNICATION
FILE REFERENCE: 04585/041005 
CURRENT APPLICATION NUMBER: US/09/366,886
CURRENT FILING DATE: 1999-08-04 
PRIOR APPLICATION NUMBER: US 08/341,018 
PRIOR FILING DATE: 1994-11-17 
NUMBER OF SEQ ID NOS: 87

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 3 
LENGTH: 1193 
TYPE: DNA 

ORGANISM: Bos taurus 



US-09-366-886-3 



Query Match 9.4%; Score 93.4; DB 11; Length 1193; 

Best Local Similarity 50.6%; Pred. No. 7.3e-17; 

Matches 408; Conservative 0; Mismatches 361; Indels 38; Gaps 6; 

Qy 194 GGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCAGCGCG 253 

I I I I I I I I II I I I I I I I 1 I I I I I I I I I I I I 

Db 18 GGGCGGCGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGGCGCCT 77 

Qy 254 AGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTT 313 

I I II I I I I I I I I I I I I I III I I I I I I I I I I 

Db 78 GGGGCCACCCCGCCTTCCCCTCCTGCGGGCGCCTCAAGGAGGACAGCAGGTACATCTTCT 137 

Qy 314 TCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACC- 372 

I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 138 TCATGGAGCCCGAGGCCAACAGCAGCGGCGGGCCCGGCCGCCTTCCGAGCCTCCTTCCCC 197 

Qy 373 AACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGC 423 

I III I I I I I I I I I I I I I I I II I I I I I I I I I I III 
Db 198 CCTCTCGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAACGG-TGC 256 

Qy 424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483

Ml I I I I I I I I I I I I II I I I I I I I I I I I III I I I I 

Db 257 GCCTTGCCTCCCCGCTTGAAAGAGATGAAGAGTCAGGAGTCTGTGGCAGGTTCCAAACTA 316

Qy 484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543

Ml I I I I I I I II I II I II I I I I I I I I I I I 

Db 317 GTGCTTCGGTGCGAGACCAGTTCTGAATACTCCTCTCTCAAGTTCAAGTGGTTCAAGAAT 376

Qy 544 GGCAAGGAGCTCAACCG CAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

II I II II III III I I I I II III I 

Db 377 GGGAGTGAATTAAGCCGAAAGAACAAACCAGAAAACATCAAGATACAGAAAAGGCCGGGG 436

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I 

Db 437 AAGTCAGGACTTCGCATTAGCAAAGCGTCACTGGCTGATTCTGGAGAATATATGTGCAAA 496

Qy 661 GCCGAGAACATCCTGGGGAAGGACA CCGTCCGGGGCCGGCTTTACGTCAACAGC 714

I I I I I I I I I I I I I I III I II I II 

Db 497 GTGATCAGCAAACTAGGAAATGACAGTGCCTCTGCCAACATCACCATTGTGGAGTCAAAC 556

Qy 715 GTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 774

I I I I I I II I III III I I I I I MM I II 

Db 557 GCCACATCCACATCTACAGCTGGGACAAGCCATCTTGTCAAGTGTGCAGAGAAGGAGAAA 616

Qy 775 TCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCC 831

II I I II II II I I II II I I I I I I I I I I I I I I I 

Db 617 ACTTTCTGTGTGAATGGAGGCGAGTGCTTCATGGTGAAAGACCTTTCAAATCCCTCAAGA 676

Qy 832 TGCAAATGTCCAAATGGATTCTTCGGACAGAGATGTTTGGAGAAACTGCCTTTG 885

M I II I I I I I I M I I I III I II I I II I I I II I II I II 
Db 677 TACTTGTGCAAGTGCCAACCTGGATTCACTGGAGCGAGATGTACTGAGAATGTGCCCATG 736

Qy 886 CGATTGTACATGCCAGATCCTAAGCAAAGTGTCCTGTGGGATACACCGGGGACAGGTGTC 945

II II III I II I I I I I I I I I II I I I II I 

Db 737 AAAGTCCAAACCCAAG AAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCC 787



Qy 946 AGCAGTTCGCAATGGTCAACTTCTCCA 972

I I I I I I I I I I I I I I I I I I I 

Db 788 AAAACTACGTAATGGCCAGCTTCTACA 814 



Search completed: January 14, 2004, 14:11:36 
Job time : 392.249 secs



